Authors
Srirangaraj Setlur, Suryaprakash Kompalli, Vemulapati Ramanaprasad, Venugopal Govindaraju
Publication date
2003/3/10
Conference
Research Issues in Data Engineering: Multi-lingual Information Management, 2003. RIDE-MLIM 2003. Proceedings. 13th International Workshop on
Pages
55-61
Publisher
IEEE
Description
The Indian subcontinent has a large number of languages, dialects, and scripts, with Devanagari being the primary and most widely used script. To date, much of the Devanagari optical character recognition (OCR) research has been restricted to a handful of groups. As a result, techniques have not yet been widely disseminated or independently evaluated, and automated evaluation tools are not available for lack of a standard representation of ground-truth and result data. A key reason for the absence of sustained research efforts in off-line Devanagari OCR appears to be the paucity of data resources. Ground-truthed data for words and characters, on-line dictionaries, corpora of text documents, and reliable, standardized statistical analyses and evaluation tools are currently lacking. The creation of such data resources will therefore undoubtedly provide a much-needed fillip to researchers working on …
Total citations
[Bar chart: citations per year, 2004-2020]
Scholar articles
S Setlur, S Kompalli, V Ramanaprasad, V Govindaraju - Proceedings. Seventeenth Workshop on Parallel and …, 2003