TY  - CPAPER
T1  - Minimize Exposure Bias of Seq2Seq Models in Joint Entity and Relation Extraction
AU - Zhang, Ranran Haoran
AU - Liu, Qianying
AU - Fan, Aysa Xuemo
AU - Ji, Heng
AU - Zeng, Daojian
AU - Cheng, Fei
AU - Kawahara, Daisuke
AU - Kurohashi, Sadao
N1 - Publisher Copyright:
© 2020 Association for Computational Linguistics
PY - 2020
Y1 - 2020
AB  - Joint entity and relation extraction aims to extract relation triplets from plain text directly. Prior work leverages Sequence-to-Sequence (Seq2Seq) models for triplet sequence generation. However, Seq2Seq enforces an unnecessary order on the unordered triplets and involves a large decoding length associated with error accumulation. These methods introduce exposure bias, which may cause the models to overfit to frequent label combinations, thus limiting their generalization ability. We propose a novel Sequence-to-Unordered-Multi-Tree (Seq2UMTree) model to minimize the effects of exposure bias by limiting the decoding length to three within a triplet and removing the order among triplets. We evaluate our model on two datasets, DuIE and NYT, and systematically study how exposure bias alters the performance of Seq2Seq models. Experiments show that the state-of-the-art Seq2Seq model overfits to both datasets, while Seq2UMTree shows significantly better generalization. Our code is available at https://github.com/WindChimeRan/OpenJERE.
UR - http://www.scopus.com/inward/record.url?scp=85109526851&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85109526851&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85109526851
T3  - Findings of the Association for Computational Linguistics: EMNLP 2020
SP - 236
EP - 246
BT  - Findings of the Association for Computational Linguistics: EMNLP 2020
PB - Association for Computational Linguistics (ACL)
T2  - 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020
Y2 - 16 November 2020 through 20 November 2020
ER -