Authors
Yezhou Yang, Ching Lik Teo, Hal Daumé III, Yiannis Aloimonos
Publication date
2011/7/27
Conference
Proceedings of the Conference on Empirical Methods in Natural Language Processing
Pages
444-454
Publisher
Association for Computational Linguistics
Description
We propose a sentence generation strategy that describes images by predicting the most likely nouns, verbs, scenes, and prepositions that make up the core sentence structure. The inputs are initial noisy estimates of the objects and scenes detected in the image using state-of-the-art trained detectors. Because predicting actions directly from still images is unreliable, we estimate verbs using a language model trained on the English Gigaword corpus, together with the probabilities of co-located nouns, scenes, and prepositions. These estimates serve as parameters of an HMM that models the sentence generation process, with hidden nodes as sentence components and image detections as the emissions. Experimental results show that our strategy of combining vision and language produces readable and descriptive sentences compared to naive strategies that use vision alone.
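The abstract describes decoding over an HMM whose hidden nodes are sentence components and whose emissions are noisy image detections. As a rough illustration only (not the paper's actual implementation), the sketch below shows standard Viterbi decoding in Python for such a model; the state labels, probabilities, and function name are hypothetical placeholders, with transition scores standing in for corpus-derived co-location statistics and emission scores for detector confidences.

```python
import numpy as np

def viterbi(states, log_prior, log_trans, log_emit):
    """Most likely hidden-state sequence under an HMM (log domain).

    states:    list of S hidden-state labels (candidate sentence components)
    log_prior: (S,) log initial-state probabilities
    log_trans: (S, S) log transition probabilities, e.g. standing in for
               corpus co-location statistics (hypothetical values here)
    log_emit:  (T, S) per-step log emission scores, e.g. noisy detector
               confidences for each candidate component
    """
    T, S = log_emit.shape
    delta = np.full((T, S), -np.inf)    # best log score ending in state s at step t
    back = np.zeros((T, S), dtype=int)  # backpointers to recover the path

    delta[0] = log_prior + log_emit[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans  # rows: prev state, cols: current
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emit[t]

    # Trace the highest-scoring path backwards from the final step.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [states[s] for s in reversed(path)]

# Toy usage with made-up numbers: three candidate components, two steps.
states = ["dog", "run", "park"]
log_prior = np.log([0.5, 0.2, 0.3])
log_trans = np.log([[0.1, 0.7, 0.2],
                    [0.3, 0.1, 0.6],
                    [0.4, 0.4, 0.2]])
log_emit = np.log([[0.6, 0.1, 0.3],   # detector scores at step 1
                   [0.2, 0.5, 0.3]])  # detector scores at step 2
print(viterbi(states, log_prior, log_trans, log_emit))  # -> ['dog', 'run']
```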
Citations per year
2012: 13, 2013: 26, 2014: 33, 2015: 47, 2016: 50, 2017: 57, 2018: 51, 2019: 50, 2020: 41, 2021: 50, 2022: 39, 2023: 37, 2024: 21
Scholar articles
Y Yang, CL Teo, H Daumé III, Y Aloimonos - Proceedings of the 2011 conference on empirical …, 2011
Y Yang, CL Teo, H Daumé III, Y Aloimonos - Corpus-guided sentence generation of …, 2011