Authors
Andrew Busch, Wageeh W Boles, Sridha Sridharan
Publication date
2005/9/26
Journal
IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume
27
Issue
11
Pages
1720-1732
Publisher
IEEE
Description
The problem of determining the script and language of a document image has a number of important applications in the field of document analysis, such as indexing and sorting of large collections of such images, or as a precursor to optical character recognition (OCR). In this paper, we investigate the use of texture as a tool for determining the script of a document image, based on the observation that text has a distinct visual texture. An experimental evaluation of a number of commonly used texture features is conducted on a newly created script database, providing a qualitative measure of which features are most appropriate for this task. Strategies for improving classification results in situations with limited training data and multiple font types are also proposed.
Total citations
[Chart: citations per year, 2006 to 2022]
Scholar articles
A Busch, WW Boles, S Sridharan - IEEE Transactions on Pattern Analysis and Machine …, 2005