Semi-supervised discourse relation classification with structural learning

Hugo Hernault*, Danushka Bollegala, Mitsuru Ishizuka

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Citations (Scopus)

Abstract

The corpora available for training discourse relation classifiers are annotated using a general set of discourse relations. However, certain applications require custom discourse relations, and creating a new annotated corpus with a new relation taxonomy is time-consuming and costly. We address this problem by proposing a semi-supervised approach to discourse relation classification based on Structural Learning. First, we solve a set of auxiliary classification problems using unlabeled data. Second, we use the learned classifiers to extend the feature vectors on which a discourse relation classifier is trained. By defining a relevant set of auxiliary classification problems, we show that the proposed method improves accuracy and F-score by at least 50% on the RST Discourse Treebank and the Penn Discourse Treebank when small training sets of ca. 1000 training instances are employed. This is an attractive prospect for training discourse relation classifiers in domains where little labeled training data is available.
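
The two steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the auxiliary problems or the learner, so the sketch assumes that each auxiliary problem predicts one masked feature from the remaining ones, that the predictors are ridge regressors, and that the shared structure is obtained by an SVD of the stacked weight vectors (as in the commonly cited alternating structure optimization formulation). The function names, the parameters `ridge` and `h`, and the random data are all hypothetical.

```python
import numpy as np


def fit_auxiliary_predictors(X_unlabeled, aux_feature_ids, ridge=1.0):
    """Fit one linear predictor per auxiliary problem on unlabeled data.

    Assumption: each auxiliary problem predicts the value of one feature
    from all the other features, so no manual labels are required.
    Returns a weight matrix W of shape (n_features, n_aux).
    """
    n_features = X_unlabeled.shape[1]
    W = np.zeros((n_features, len(aux_feature_ids)))
    for k, j in enumerate(aux_feature_ids):
        y = X_unlabeled[:, j].copy()      # pseudo-label: the masked feature
        X_masked = X_unlabeled.copy()
        X_masked[:, j] = 0.0              # hide the target from the predictor
        A = X_masked.T @ X_masked + ridge * np.eye(n_features)
        W[:, k] = np.linalg.solve(A, X_masked.T @ y)  # ridge regression
    return W


def shared_structure(W, h):
    """Keep the top-h left singular vectors of W as a projection matrix."""
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    return U[:, :h].T                     # shape (h, n_features)


def extend_features(X, theta):
    """Append the projected features theta @ x to each feature vector x."""
    return np.hstack([X, X @ theta.T])


# Synthetic binary features stand in for real discourse features.
rng = np.random.default_rng(0)
X_unlabeled = (rng.random((200, 12)) < 0.3).astype(float)
X_labeled = (rng.random((20, 12)) < 0.3).astype(float)

W = fit_auxiliary_predictors(X_unlabeled, aux_feature_ids=[0, 1, 2, 3])
theta = shared_structure(W, h=2)
X_extended = extend_features(X_labeled, theta)
```

The extended matrix `X_extended` would then be passed to any supervised classifier trained on the small labeled set; the choice of `h` and of the auxiliary problems would need to be tuned for the task.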

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 340-352
Number of pages: 13
Volume: 6608 LNCS
Edition: PART 1
Publication status: Published - 2011
Externally published: Yes
Event: 12th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2011 - Tokyo
Duration: 2011 Feb 20 to 2011 Feb 26

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 6608 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 12th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2011
City: Tokyo
Period: 2011 Feb 20 to 2011 Feb 26

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science
