Improving Syntactical Clone Detection Methods through the Use of an Intermediate Representation

Pedro M. Caldeira, Kazunori Sakamoto, Hironori Washizaki, Yoshiaki Fukazawa, Takahisa Shimada

Research output: Conference contribution

9 Citations (Scopus)

Abstract

Detection of type-3 and type-4 clones remains a difficult task. Current methods are complex, both conceptually and computationally. Similarly, using them requires substantial implementation effort. Instead of creating yet another method, it might be more productive to combine the simplicity of syntactic approaches with the abstractions granted by intermediate representations (IR). To this end, we devised a C-like IR based on LLVM and ran NiCad on it (LLNiCad). To establish whether the clone detection capabilities of syntactic approaches can be improved through an IR, we compared NiCad and LLNiCad on three open source projects taken from Krutz's benchmark and a subset of Google Code Jam (GCJ) solutions. In our results, the F1-score of LLNiCad consistently outperforms NiCad. Indeed, for all clone types in Krutz's benchmark, LLNiCad has an F1-score that is 37% higher than NiCad, with both better precision and recall. For type-4 clones in our GCJ benchmark, the F1-score of LLNiCad also outperforms CCCD (a semantic clone detector) by 44%. These findings suggest that IRs are beneficial for improving clone detection and that they have a larger impact on type-3 and type-4 clones.
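The abstract compares detectors by F1-score, precision, and recall. As a reminder of how these relate, here is a minimal sketch that assumes the standard definitions (F1 as the harmonic mean of precision and recall) applied to reported clone pairs. The function name and the counts are hypothetical and are not taken from the paper or from any of the tools it evaluates.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall for a set of reported clone pairs."""
    reported = true_positives + false_positives   # pairs the detector flagged
    actual = true_positives + false_negatives     # pairs that really are clones
    if reported == 0 or actual == 0:
        return 0.0
    precision = true_positives / reported
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not results from the paper).
print(f1_score(true_positives=8, false_positives=2, false_negatives=2))
```

Because F1 is a harmonic mean, it rises substantially only when precision and recall improve together, which is why the abstract reports both alongside the F1 gain.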

Original language: English
Title of host publication: IWSC 2020 - Proceedings of the 2020 IEEE 14th International Workshop on Software Clones
Editors: Hitesh Sajnani, Chaiyong Ragkhitwetsagul
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 8-14
Number of pages: 7
ISBN (Electronic): 9781728162690
DOI
Publication status: Published - Feb 2020
Event: 14th IEEE International Workshop on Software Clones, IWSC 2020 - London, Canada
Duration: 18 Feb 2020 → …

Publication series

Name: IWSC 2020 - Proceedings of the 2020 IEEE 14th International Workshop on Software Clones

Conference

Conference: 14th IEEE International Workshop on Software Clones, IWSC 2020
Country/Territory: Canada
City: London
Period: 20/2/18 → …

ASJC Scopus subject areas

  • Software
  • Safety, Risk, Reliability and Quality
