Context-aware multimodal Recurrent Neural Network for automatic image captioning
dc.contributor.advisor | Grootjen, F.A. | |
dc.contributor.advisor | Versteeg, R. | |
dc.contributor.author | Rijn, F.W. van | |
dc.date.issued | 2016-08-25 | |
dc.description.abstract | Automatic image captioning is a state-of-the-art computer vision task in which an image is automatically described with text. In some cases an image is accompanied by text, for instance in books or news articles. In this study a context-aware model is proposed that uses not only the image but also the text surrounding the image to generate a description. The model uses a joint LSTM with attention on both the image and the context and is trained on the Microsoft COCO dataset. This study also explored several setups for representing the text as a feature vector. Results show quantitative and qualitative improvements when context is included. Future directions are automating the feature crafting as well as applying the model to more datasets. | en_US |
dc.identifier.uri | http://hdl.handle.net/123456789/2634 | |
dc.language.iso | en | en_US |
dc.thesis.faculty | Faculteit der Sociale Wetenschappen | en_US |
dc.thesis.specialisation | Master Artificial Intelligence | en_US |
dc.thesis.studyprogramme | Artificial Intelligence | en_US |
dc.thesis.type | Master | en_US |
dc.title | Context-aware multimodal Recurrent Neural Network for automatic image captioning | en_US |
Files
Original bundle
- Name: Rijn, van,F._MSc_Thesis.pdf
- Size: 5.18 MB
- Format: Adobe Portable Document Format
- Description: Thesis text
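The abstract describes a joint LSTM that attends over both the image and the surrounding text before generating each caption word. As a purely illustrative sketch (this is not the thesis's actual implementation; the function names, dot-product scoring, and toy dimensions are all assumptions), one decoding step might combine an attention summary of image-region features with an attention summary of context word vectors into a single fused LSTM input:

```python
import math

def softmax(scores):
    # numerically stable softmax over a list of raw attention scores
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def attend(query, features):
    # dot-product attention: weight each feature vector by its
    # similarity to the query (e.g. the LSTM hidden state)
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * feat[i] for w, feat in zip(weights, features))
            for i in range(dim)]

def fuse_step(hidden, image_regions, context_vectors):
    # attend separately over image regions and context word vectors,
    # then concatenate the two summaries as the LSTM input for this step
    img = attend(hidden, image_regions)
    ctx = attend(hidden, context_vectors)
    return img + ctx

# toy example: 2-d hidden state, two image regions, three context words
hidden = [1.0, 0.0]
image_regions = [[0.9, 0.1], [0.2, 0.8]]
context_vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
x_t = fuse_step(hidden, image_regions, context_vectors)
print(len(x_t))  # prints 4: the fused input concatenates both summaries
```

In a real model the attention scores would come from learned projections rather than raw dot products, and `x_t` would feed an LSTM cell; the sketch only shows how image and context evidence can be merged per time step.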