Authors
Aditya Joshi, AR Balamurali, Pushpak Bhattacharyya
Publication date
2010/4
Journal
Proceedings of the 8th ICON
Description
Sentiment Analysis (SA) research has gained tremendous momentum in recent times. However, there has been little work in this area for an Indian language. We propose in this paper a fall-back strategy to do sentiment analysis for Hindi documents, a problem on which, to the best of our knowledge, no work has been done until now. (A) First of all, we study three approaches to perform SA in Hindi. We have developed a sentiment-annotated corpus in the Hindi movie review domain. The first of our approaches involves training a classifier on this annotated Hindi corpus and using it to classify a new Hindi document. (B) In the second approach, we translate the given document into English and use a classifier trained on standard English movie reviews to classify the document. (C) In the third approach, we develop a lexical resource called Hindi-SentiWordNet (H-SWN) and implement a majority score based strategy to classify the given document.
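As a rough illustration of the third approach, the sketch below shows one plausible reading of a majority score based strategy: each word found in a polarity lexicon votes for whichever of its positive or negative scores is larger, and the document takes the label with more votes. This is an assumption about the general idea, not the authors' implementation. The lexicon entries, their scores, and the names `TOY_LEXICON`, `word_polarity`, and `classify_by_majority` are hypothetical and are not taken from H-SWN.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> (positive score, negative score).
# The entries and scores are illustrative only, not taken from H-SWN.
TOY_LEXICON = {
    "अच्छा": (0.75, 0.0),    # "good"
    "शानदार": (0.875, 0.0),  # "splendid"
    "बुरा": (0.0, 0.75),     # "bad"
    "बेकार": (0.0, 0.625),   # "useless"
}


def word_polarity(word, lexicon):
    """Return 'positive', 'negative', or None for a single word."""
    scores = lexicon.get(word)
    if scores is None:
        return None  # word is not in the lexicon, so it casts no vote
    pos, neg = scores
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return None  # equal scores: treat the word as carrying no polarity


def classify_by_majority(document, lexicon=TOY_LEXICON):
    """Label a document by the majority polarity of its lexicon words."""
    votes = Counter()
    # Whitespace tokenization is a simplification made for this sketch.
    for token in document.split():
        label = word_polarity(token, lexicon)
        if label is not None:
            votes[label] += 1
    if votes["positive"] > votes["negative"]:
        return "positive"
    if votes["negative"] > votes["positive"]:
        return "negative"
    return "neutral"  # tie, or no lexicon words found
```

Returning "neutral" on a tie is a choice made here for the sketch; a real system would also need proper tokenization and some handling of word senses, which this example does not attempt.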
A comparison of the performance of these approaches implies that we can adopt a fallback strategy for doing sentiment analysis for a new language, viz., (1) Train a sentiment classifier on an in-language labeled corpus and use this classifier to classify a new document. (2) If in-language training data is not available, apply rough machine translation to translate the new document into a resource-rich language like English and detect the polarity of the translated document using a classifier for English, assuming polarity is not lost in translation. (3) If the translation cannot be done, put in
Total citations
[Chart: citations per year, 2013 to 2024]
Scholar articles
A Joshi, AR Balamurali, P Bhattacharyya - Proceedings of the 8th ICON, 2010