TY - GEN
T1 - WUY at SemEval-2020 Task 7
T2 - 14th International Workshops on Semantic Evaluation, SemEval 2020
AU - Zhang, Cheng
AU - Yamana, Hayato
N1 - Funding Information:
The authors would like to thank the reviewers for their insightful comments and constructive suggestions on improving this work.
Publisher Copyright:
© 2020 14th International Workshops on Semantic Evaluation, SemEval 2020 - co-located 28th International Conference on Computational Linguistics, COLING 2020, Proceedings. All rights reserved.
PY - 2020
Y1 - 2020
N2 - This paper describes our participation in SemEval 2020 Task 7 on the assessment of humor in edited news headlines, which includes two subtasks: estimating the humor of micro-edited news headlines (subtask A) and predicting the more humorous of two edited headlines (subtask B). To address these tasks, we propose two systems. The first system adopts a regression-based fine-tuned single-sequence bidirectional encoder representations from transformers (BERT) model with easy data augmentation (EDA), called “BERT+EDA”. The second system adopts a hybrid of a regression-based fine-tuned sequence-pair BERT model and a combined Naive Bayes and support vector machine (SVM) model estimated on term frequency-inverse document frequency (TF-IDF) features, called “BERT+NB-SVM”. No additional training datasets were used, and the BERT+NB-SVM model outperformed BERT+EDA. The official root-mean-square error (RMSE) score for subtask A is 0.57369 and ranks 31st out of 48, whereas the best RMSE of BERT+NB-SVM is 0.52429, ranking 7th. For subtask B, we simply use a sequence-pair BERT model, the official accuracy of which is 0.53196 and ranks 25th out of 32.
AB - This paper describes our participation in SemEval 2020 Task 7 on the assessment of humor in edited news headlines, which includes two subtasks: estimating the humor of micro-edited news headlines (subtask A) and predicting the more humorous of two edited headlines (subtask B). To address these tasks, we propose two systems. The first system adopts a regression-based fine-tuned single-sequence bidirectional encoder representations from transformers (BERT) model with easy data augmentation (EDA), called “BERT+EDA”. The second system adopts a hybrid of a regression-based fine-tuned sequence-pair BERT model and a combined Naive Bayes and support vector machine (SVM) model estimated on term frequency-inverse document frequency (TF-IDF) features, called “BERT+NB-SVM”. No additional training datasets were used, and the BERT+NB-SVM model outperformed BERT+EDA. The official root-mean-square error (RMSE) score for subtask A is 0.57369 and ranks 31st out of 48, whereas the best RMSE of BERT+NB-SVM is 0.52429, ranking 7th. For subtask B, we simply use a sequence-pair BERT model, the official accuracy of which is 0.53196 and ranks 25th out of 32.
UR - http://www.scopus.com/inward/record.url?scp=85115668497&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85115668497&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85115668497
T3 - 14th International Workshops on Semantic Evaluation, SemEval 2020 - co-located 28th International Conference on Computational Linguistics, COLING 2020, Proceedings
SP - 1071
EP - 1076
BT - 14th International Workshops on Semantic Evaluation, SemEval 2020 - co-located 28th International Conference on Computational Linguistics, COLING 2020, Proceedings
A2 - Herbelot, Aurelie
A2 - Zhu, Xiaodan
A2 - Palmer, Alexis
A2 - Schneider, Nathan
A2 - May, Jonathan
A2 - Shutova, Ekaterina
PB - International Committee on Computational Linguistics
Y2 - 12 December 2020 through 13 December 2020
ER -