Extraction of lexical bundles used in natural language processing articles

Chooi Ling Goh, Yves Lepage

Research output: Conference contribution

Abstract

Lexical bundles are indispensable for fluent academic writing. They might not constitute complete structural units, but they occur very frequently in academic conversations, conference presentations and scientific articles. This paper shows how to collect a large database of lexical bundles from articles in the Natural Language Processing (NLP) domain. We first collect highly frequent N-grams from the ACL-ARC collection of NLP articles and then classify them as true or false lexical bundles using machine learning models trained on a set of manually checked bundles. In a verification experiment, our best model achieves an accuracy of 76%. Using this model, we extract more than 18,000 lexical bundles from the ACL-ARC corpus, which we publicly release.
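
The first step described in the abstract, collecting highly frequent N-grams as candidate bundles, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `extract_frequent_ngrams`, the whitespace tokenization, the N-gram lengths (3 to 5) and the frequency threshold are all assumptions made for illustration, and the toy corpus stands in for ACL-ARC.

```python
from collections import Counter


def extract_frequent_ngrams(sentences, n_min=3, n_max=5, min_count=2):
    """Count word N-grams and keep those seen at least min_count times.

    All parameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    counts = Counter()
    for sentence in sentences:
        # Naive lowercase + whitespace tokenization (an assumption).
        tokens = sentence.lower().split()
        for n in range(n_min, n_max + 1):
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    # Keep only N-grams that reach the frequency threshold.
    return {gram: c for gram, c in counts.items() if c >= min_count}


# Toy stand-in for a corpus of NLP articles.
toy_corpus = [
    "in this paper we propose a new method",
    "in this paper we describe a parser",
    "the results show that in this paper we improve accuracy",
]

candidates = extract_frequent_ngrams(toy_corpus)
for gram, count in sorted(candidates.items()):
    print(count, gram)
```

In the pipeline the abstract outlines, candidates of this kind would then be passed to a classifier trained on manually checked bundles, which separates true lexical bundles from false ones; that second step is not sketched here.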

Original language: English
Title of host publication: 2019 International Conference on Advanced Computer Science and Information Systems, ICACSIS 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 223-228
Number of pages: 6
ISBN (electronic): 9781728152929
Publication status: Published - Oct 2019
Event: 11th International Conference on Advanced Computer Science and Information Systems, ICACSIS 2019 - Bali, Indonesia
Duration: 12 Oct 2019 to 13 Oct 2019

Publication series

Name: 2019 International Conference on Advanced Computer Science and Information Systems, ICACSIS 2019

Conference

Conference: 11th International Conference on Advanced Computer Science and Information Systems, ICACSIS 2019
Country/Territory: Indonesia
City: Bali
Period: 19/10/12 to 19/10/13

ASJC Scopus subject areas

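  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Information Systems
  • Health Informatics
  • Education
  • Communication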
