TY - GEN
T1 - Controlling contents in data-to-document generation with human-designed topic labels
AU - Aoki, Kasumi
AU - Miyazawa, Akira
AU - Ishigaki, Tatsuya
AU - Aoki, Tatsuya
AU - Noji, Hiroshi
AU - Goshima, Keiichi
AU - Kobayashi, Ichiro
AU - Takamura, Hiroya
AU - Miyao, Yusuke
N1 - Publisher Copyright: © 2019 Association for Computational Linguistics
PY - 2019
Y1 - 2019
N2 - We propose a data-to-document generator, based on a neural language model, that can easily control the contents of output texts. A conventional data-to-text model is useful when a reader seeks a global summary of the data, because it only has to describe an important part that has been extracted beforehand. However, since what readers are interested in differs from user to user, it is necessary to develop a method that generates various summaries according to users’ requests. We develop a model that generates various summaries and controls their contents by providing the model with explicit targets for reference as controllable factors. In the experiments, we used five-minute or one-hour charts of nine indicators (e.g., Nikkei 225) as time-series data and daily summaries of Nikkei Quick News as textual data. We conducted comparative experiments using two kinds of referential information for generation: human-designed topic labels indicating the contents of a sentence, and automatically extracted keywords. The experiments show that both models using additional information about the target document achieved higher performance in terms of BLEU and human evaluation. We found that human-designed topic labels are superior to extracted keywords in terms of controllability.
UR - http://www.scopus.com/inward/record.url?scp=85087142657&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85087142657&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85087142657
T3 - INLG 2019 - 12th International Conference on Natural Language Generation, Proceedings of the Conference
SP - 323
EP - 332
BT - INLG 2019 - 12th International Conference on Natural Language Generation, Proceedings of the Conference
PB - Association for Computational Linguistics (ACL)
T2 - 12th International Conference on Natural Language Generation, INLG 2019
Y2 - 29 October 2019 through 1 November 2019
ER -