Text-guided attention model for image captioning
- Mun, J.; Cho, M.; Han, B.
- Date Issued
- AAAI press
- Visual attention plays an important role in understanding images and has proven effective in generating natural language descriptions of images. On the other hand, recent studies show that language associated with an image can steer visual attention in the scene during our cognitive process. Inspired by this, we introduce a text-guided attention model for image captioning, which learns to drive visual attention using associated captions. For this model, we propose an exemplar-based learning approach that retrieves from training data the captions associated with each image and uses them to learn attention on visual features. Our attention model makes it possible to describe the detailed state of scenes by effectively distinguishing small or confusable objects. We validate our model on the MSCOCO Captioning benchmark and achieve state-of-the-art performance in standard metrics. Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
- Article Type
- 31st AAAI Conference on Artificial Intelligence, AAAI 2017, pp. 4233-4239, 2017-02