TY - GEN
T1 - Data Augmentation for Historical Documents via Cascade Variational Auto-Encoder
AU - Cao, Guanyu
AU - Kamata, Sei Ichiro
N1 - Funding Information:
This work was partially supported by JSPS KAKENHI Grant Number 18K11380.
Publisher Copyright:
© 2019 IEEE.
PY - 2019/9
Y1 - 2019/9
N2 - In this paper, we introduce a novel model based on the Variational Auto-Encoder (VAE) that finds subclasses of categories and generates new samples faithful to each subclass in an unsupervised way. When generating characters from historical documents, the model helps the generated characters avoid ambiguity in cases where one character has multiple writing styles that are not labeled. With this model, we augment a historical Japanese document dataset to make it more balanced. The model is trained in two steps. In the first step, it learns the data distribution and learns to map character images to basic shape vectors. In the second step, it learns to generate new samples conditioned on the basic shape vectors. The generated samples are more robust against intra-class multi-modality. Using the augmented dataset improves the recognition rate. An ablation study is performed to evaluate the effectiveness of the data augmentation.
KW - Conditional Generation
KW - Dataset Augmentation
KW - Unsupervised Learning
UR - http://www.scopus.com/inward/record.url?scp=85084749914&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85084749914&partnerID=8YFLogxK
U2 - 10.1109/ICSIPA45851.2019.8977737
DO - 10.1109/ICSIPA45851.2019.8977737
M3 - Conference contribution
AN - SCOPUS:85084749914
T3 - Proceedings of the 2019 IEEE International Conference on Signal and Image Processing Applications, ICSIPA 2019
SP - 340
EP - 345
BT - Proceedings of the 2019 IEEE International Conference on Signal and Image Processing Applications, ICSIPA 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE International Conference on Signal and Image Processing Applications, ICSIPA 2019
Y2 - 17 September 2019 through 19 September 2019
ER -