ABCD: Analogy-Based Controllable Data Augmentation

Shuo Yang*, Yves Lepage

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We propose an analogy-based data augmentation approach for sentiment and style transfer named Analogy-Based Controllable Data Augmentation (ABCD). The objective of data augmentation is to expand the number of sentences based on a limited amount of available data. We are given two unpaired corpora with different styles. In data augmentation, we retain the original text style while changing words to generate new sentences. We first train a self-attention-based convolutional neural network to compute the distribution of the contribution of each word to style in a given sentence. We call the words with high style contribution style-characteristic words. By substituting content words and style-characteristic words separately, we generate two new sentences. We use an analogy between the original sentence and these two additional sentences to generate another sentence. The results show that our proposed approach decreases perplexity by about 4 points and outperforms baselines on three transfer datasets.
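
The analogy step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the analogy is solved or how substitutes are chosen, so the position-wise solver `solve_analogy`, the helper `substitute`, and the hand-written replacement words below are all assumptions made for illustration. The trained self-attention-based network that identifies style-characteristic words is replaced here by a manual choice.

```python
from typing import Dict, List, Optional


def substitute(tokens: List[str], replacements: Dict[str, str]) -> List[str]:
    """Return a copy of tokens with the listed words replaced (hypothetical helper)."""
    return [replacements.get(tok, tok) for tok in tokens]


def solve_analogy(a: List[str], b: List[str], c: List[str]) -> Optional[List[str]]:
    """Solve A : B :: C : D position by position.

    Assumption: all three sentences have the same length and differ from A
    only by word substitutions. Returns None when no solution is found.
    """
    if not (len(a) == len(b) == len(c)):
        return None
    d = []
    for x, y, z in zip(a, b, c):
        if x == y:
            # A and B agree here, so D follows C.
            d.append(z)
        elif x == z:
            # A and C agree here, so D follows B.
            d.append(y)
        else:
            # Both variants changed the same position: no simple solution.
            return None
    return d


# Original sentence; "delicious" is assumed to be its style-characteristic word.
original = "the food was delicious".split()
# Variant 1: substitute a content word only.
content_variant = substitute(original, {"food": "coffee"})
# Variant 2: substitute the style-characteristic word with one of the same style.
style_variant = substitute(original, {"delicious": "wonderful"})
# Fourth sentence obtained from the analogy original : content :: style : ?
new_sentence = solve_analogy(original, content_variant, style_variant)
print(" ".join(new_sentence))
```

In this toy setting, one original sentence yields three additional sentences (the two variants and the analogy solution), all of which keep the original style because the style-characteristic word is only ever exchanged for a word assumed to carry the same style.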

Original language: English
Title of host publication: Theory and Practice of Natural Computing - 10th International Conference, TPNC 2021, Proceedings
Editors: Claus Aranha, Carlos Martín-Vide, Miguel A. Vega-Rodríguez
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 69-81
Number of pages: 13
ISBN (Print): 9783030904241
Publication status: Published - 2021
Event: 10th International Conference on Theory and Practice of Natural Computing, TPNC 2021 - Virtual, Online
Duration: 2021 Dec 7 → 2021 Dec 10

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13082 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 10th International Conference on Theory and Practice of Natural Computing, TPNC 2021
City: Virtual, Online
Period: 21/12/7 → 21/12/10

Keywords

  • Affective computing
  • Computing with words
  • Natural language processing
  • Neural networks

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
