Chinese Word Segmentation by Mining Maximized Substrings

Mo Shen, Daisuke Kawahara, Sadao Kurohashi

Research output: Conference contribution

2 Citations (Scopus)

Abstract

A major problem in the field of Chinese word segmentation is the identification of out-of-vocabulary words. We propose a simple yet effective approach for extracting maximized substrings, which provide good estimations of unknown word boundaries. We also develop a new semi-supervised segmentation technique that incorporates retrieved substrings using discriminative learning. The effectiveness of this novel approach is demonstrated through experiments using both in-domain and out-of-domain data.
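The abstract does not define "maximized substrings", so the following is only an illustrative sketch, not the authors' extraction method. It assumes the textbook notion of a maximal repeat as a stand-in: a substring that occurs at least twice and cannot be extended to the left or right without losing an occurrence. The function name `maximal_repeats` and the parameters `min_len` and `max_len` are hypothetical, and the brute-force enumeration is chosen for readability rather than efficiency.

```python
from collections import defaultdict


def maximal_repeats(text, min_len=2, max_len=8):
    """Return {substring: count} for repeated substrings that cannot be
    extended left or right without losing an occurrence.

    Assumption: this textbook "maximal repeat" definition stands in for the
    paper's maximized substrings, which the abstract does not define.
    """
    n = len(text)

    # Record every start position of every substring within the length bounds.
    positions = defaultdict(list)
    for i in range(n):
        for j in range(i + min_len, min(n, i + max_len) + 1):
            positions[text[i:j]].append(i)

    result = {}
    for sub, starts in positions.items():
        if len(starts) < 2:
            continue  # not repeated
        # Neighbouring characters of each occurrence; None marks a text edge.
        left = {text[s - 1] if s > 0 else None for s in starts}
        right = {
            text[s + len(sub)] if s + len(sub) < n else None for s in starts
        }
        # A single shared neighbour means every occurrence can be extended
        # in that direction, so the substring is not maximal.
        if len(left) > 1 and len(right) > 1:
            result[sub] = len(starts)
    return result


if __name__ == "__main__":
    sample = "我喜欢自然语言处理，他研究自然语言处理。"
    print(maximal_repeats(sample))
```

On the sample sentence, the only substring that passes both checks is 自然语言处理, which appears twice with different characters on each side. Its fragments, such as 自然语言, are rejected because every occurrence is followed by the same character. This is the intuition behind treating such substrings as candidate word boundaries for out-of-vocabulary terms.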

Original language: English
Title of host publication: 6th International Joint Conference on Natural Language Processing, IJCNLP 2013 - Proceedings of the Main Conference
Editors: Ruslan Mitkov, Jong C. Park
Publisher: Asian Federation of Natural Language Processing
Pages: 171-179
Number of pages: 9
ISBN (electronic): 9784990734800
Publication status: Published - 2013
Externally published: Yes
Event: 6th International Joint Conference on Natural Language Processing, IJCNLP 2013 - Nagoya, Japan
Duration: 14 Oct 2013 → …

Publication series

Name: 6th International Joint Conference on Natural Language Processing, IJCNLP 2013 - Proceedings of the Main Conference

Conference

Conference: 6th International Joint Conference on Natural Language Processing, IJCNLP 2013
Country/Territory: Japan
City: Nagoya
Period: 14 Oct 2013 → …

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software