TY - GEN
T1 - Duplicate Bug Report Detection by Using Sentence Embedding and Fine-tuning
AU - Isotani, Haruna
AU - Washizaki, Hironori
AU - Fukazawa, Yoshiaki
AU - Nomoto, Tsutomu
AU - Ouji, Saori
AU - Saito, Shinobu
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
AB - Industrial software maintenance devotes much time and effort to finding duplicate bug reports. In this paper, we propose an automated duplicate bug report detection system to improve software maintenance efficiency. Our system detects duplicate reports by vectorizing the contents of each report item with deep-learning-based sentence embedding and calculating the similarity of whole reports from the similarities of the item vectors. Sentence-BERT fine-tuned on report texts is used for sentence embedding. Finally, in industrial experiments comparing our system with existing methods, we verify that the combination of item-wise processing and Sentence-BERT fine-tuned on reports detects duplicate bug reports effectively.
KW - BERT
KW - Bug reports
KW - duplicate detection
KW - information retrieval
KW - natural language processing
KW - sentence embedding
UR - http://www.scopus.com/inward/record.url?scp=85123377197&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85123377197&partnerID=8YFLogxK
U2 - 10.1109/ICSME52107.2021.00054
DO - 10.1109/ICSME52107.2021.00054
M3 - Conference contribution
AN - SCOPUS:85123377197
T3 - Proceedings - 2021 IEEE International Conference on Software Maintenance and Evolution, ICSME 2021
SP - 535
EP - 544
BT - Proceedings - 2021 IEEE International Conference on Software Maintenance and Evolution, ICSME 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 37th IEEE International Conference on Software Maintenance and Evolution, ICSME 2021
Y2 - 27 September 2021 through 1 October 2021
ER -