Improving crowdsourcing-based annotation of Japanese discourse relations

Yudai Kishimoto, Shinnosuke Sawada, Yugo Murawaki, Daisuke Kawahara, Sadao Kurohashi

Research output: Conference contribution

7 citations (Scopus)

Abstract

Although discourse parsing is an important and fundamental task in natural language processing, few languages have corpora annotated with discourse relations, and those that exist are small. Creating a new corpus of discourse relations by hand is costly and time-consuming. To cope with this problem, Kawahara et al. (2014) constructed a Japanese corpus with discourse annotations through crowdsourcing. However, they did not evaluate the quality of the annotation. In this paper, we evaluate the quality of the annotation using expert annotations. We find that crowdsourcing-based annotation still leaves much room for improvement. Based on an error analysis, we propose improvement techniques based on language tests. We re-annotated the corpus with discourse annotations using these techniques and achieved an improvement of approximately 3% in F-measure. We will make the re-annotated data publicly available.

Original language: English
Title of host publication: LREC 2018 - 11th International Conference on Language Resources and Evaluation
Editors: Hitoshi Isahara, Bente Maegaard, Stelios Piperidis, Christopher Cieri, Thierry Declerck, Koiti Hasida, Helene Mazo, Khalid Choukri, Sara Goggi, Joseph Mariani, Asuncion Moreno, Nicoletta Calzolari, Jan Odijk, Takenobu Tokunaga
Publisher: European Language Resources Association (ELRA)
Pages: 4044-4048
Number of pages: 5
ISBN (electronic): 9791095546009
Publication status: Published - 2019
Externally published: Yes
Event: 11th International Conference on Language Resources and Evaluation, LREC 2018 - Miyazaki, Japan
Duration: 7 May 2018 to 12 May 2018

Publication series

Name: LREC 2018 - 11th International Conference on Language Resources and Evaluation

Other

Other: 11th International Conference on Language Resources and Evaluation, LREC 2018
Country/Territory: Japan
City: Miyazaki
Period: 7 May 2018 to 12 May 2018

ASJC Scopus subject areas

  • Linguistics and Language
  • Education
  • Library and Information Sciences
  • Language and Linguistics
