Data Augmentation for Ancient Characters via Semi-MixFontGan

Zhiyi Yuan, Sei Ichiro Kamata

Research output: Conference contribution

1 Citation (Scopus)

Abstract

Ancient documents give people a way to understand history. However, existing materials suffer from unbalanced character datasets and from intra-class multimodality of fonts. As a result, neither humans nor recognition systems can identify these characters effectively. To address these problems, we propose Semi-MixFontGan, a font generation method based on a semi-supervised strategy that learns from a small amount of labeled font data to aggregate the subclass information of each category and generate characters. When generating new samples from ancient books that have only a small amount of labeled font data, the model automatically learns the differences between fonts and generates font-consistent characters. The model is composed of two parts. In the first part, we propose a MixFont method that mixes labeled, unlabeled, and generated data, and then use a convolutional autoencoder to learn the font information. In the second part, the generator network produces reasonable and realistic images with the help of a Font Discriminator and a Content Discriminator. With this model, the ancient book dataset can be made more balanced. Experiments show that the characters generated by our model achieve good visual quality and maintain font consistency with the training data. With the augmented data, the accuracy of the recognition network increases. Contribution: We propose a novel font generation method with semi-supervised learning to generate characters from a small labeled-font Kuzushiji dataset.
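The balancing goal stated in the abstract can be illustrated with a small sketch. It only computes how many synthetic characters a generator would need to produce per class; it does not implement Semi-MixFontGan. The function name `generation_plan`, the toy labels, and the choice to raise every class to the size of the largest class are all assumptions made for illustration and do not come from the paper.

```python
from collections import Counter


def generation_plan(labels, target=None):
    """Return {class: number of synthetic samples needed to reach `target`}.

    Assumption (not from the paper): every class is raised to the size of
    the largest class unless an explicit `target` is given.
    """
    counts = Counter(labels)
    if not counts:
        return {}
    if target is None:
        target = max(counts.values())
    # Classes already at or above the target need no generated samples.
    return {cls: max(0, target - n) for cls, n in counts.items()}


# Hypothetical, heavily unbalanced character labels.
labels = ["ku"] * 5 + ["shi"] * 2 + ["ji"] * 1
plan = generation_plan(labels)
print(plan)
```

A generator such as the one described in the abstract would then be asked for `plan[cls]` new images of each class, so that rare characters receive the most augmentation.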

Original language: English
Title of host publication: 2020 Joint 9th International Conference on Informatics, Electronics and Vision and 2020 4th International Conference on Imaging, Vision and Pattern Recognition, ICIEV and icIVPR 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728193311
DOI
Publication status: Published - 26 Aug 2020
Event: Joint 9th International Conference on Informatics, Electronics and Vision and 4th International Conference on Imaging, Vision and Pattern Recognition, ICIEV and icIVPR 2020 - Kitakyushu, Japan
Duration: 26 Aug 2020 to 29 Aug 2020

Publication series

Name: 2020 Joint 9th International Conference on Informatics, Electronics and Vision and 2020 4th International Conference on Imaging, Vision and Pattern Recognition, ICIEV and icIVPR 2020

Conference

Conference: Joint 9th International Conference on Informatics, Electronics and Vision and 4th International Conference on Imaging, Vision and Pattern Recognition, ICIEV and icIVPR 2020
Country/Territory: Japan
City: Kitakyushu
Period: 20/8/26 to 20/8/29

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Information Systems
  • Electrical and Electronic Engineering
  • Instrumentation
