TY - GEN
T1 - A new pre-classification method based on associative matching method
AU - Katsuyama, Yutaka
AU - Minagawa, Akihiro
AU - Hotta, Yoshinobu
AU - Omachi, Shinichiro
AU - Kato, Nei
PY - 2010
Y1 - 2010
N2 - Reducing the time complexity of character matching is critical to the development of efficient Japanese Optical Character Recognition (OCR) systems. To shorten processing time, recognition is usually split into separate preclassification and recognition stages. For high overall recognition performance, the pre-classification stage must both have very high classification accuracy and return only a small number of putative character categories for further processing. Furthermore, for any practical system, the speed of the pre-classification stage is also critical. The associative matching (AM) method has often been used for fast pre-classification, because its use of a hash table and reliance solely on logical bit operations to select categories makes it highly efficient. However, redundant certain level of redundancy exists in the hash table because it is constructed using only the minimum and maximum values of the data on each axis and therefore does not take account of the distribution of the data. We propose a modified associative matching method that satisfies the performance criteria described above but in a fraction of the time by modifying the hash table to reflect the underlying distribution of training characters. Furthermore, we show that our approach outperforms pre-classification by clustering, ANN and conventional AM in terms of classification accuracy, discriminative power and speed. Compared to conventional associative matching, the proposed approach results in a 47% reduction in total processing time across an evaluation test set comprising 116,528 Japanese character images.
AB - Reducing the time complexity of character matching is critical to the development of efficient Japanese Optical Character Recognition (OCR) systems. To shorten processing time, recognition is usually split into separate preclassification and recognition stages. For high overall recognition performance, the pre-classification stage must both have very high classification accuracy and return only a small number of putative character categories for further processing. Furthermore, for any practical system, the speed of the pre-classification stage is also critical. The associative matching (AM) method has often been used for fast pre-classification, because its use of a hash table and reliance solely on logical bit operations to select categories makes it highly efficient. However, redundant certain level of redundancy exists in the hash table because it is constructed using only the minimum and maximum values of the data on each axis and therefore does not take account of the distribution of the data. We propose a modified associative matching method that satisfies the performance criteria described above but in a fraction of the time by modifying the hash table to reflect the underlying distribution of training characters. Furthermore, we show that our approach outperforms pre-classification by clustering, ANN and conventional AM in terms of classification accuracy, discriminative power and speed. Compared to conventional associative matching, the proposed approach results in a 47% reduction in total processing time across an evaluation test set comprising 116,528 Japanese character images.
KW - Associative matching method
KW - Clustering
KW - Hash table
KW - OCR
KW - Pre-classification
UR - http://www.scopus.com/inward/record.url?scp=77949797724&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77949797724&partnerID=8YFLogxK
U2 - 10.1117/12.838842
DO - 10.1117/12.838842
M3 - Conference contribution
AN - SCOPUS:77949797724
SN - 9780819479273
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - Proceedings of SPIE-IS and T Electronic Imaging - Document Recognition and Retrieval XVII
T2 - Document Recognition and Retrieval XVII
Y2 - 19 January 2010 through 21 January 2010
ER -