TY - GEN
T1 - Clone or relative?
T2 - 2016 2nd ACM International Workshop on Security and Privacy Analytics, IWSPA 2016
AU - Ishii, Yuta
AU - Watanabe, Takuya
AU - Akiyama, Mitsuaki
AU - Mori, Tatsuya
N1 - Publisher Copyright: © 2016 ACM.
PY - 2016/3/11
Y1 - 2016/3/11
N2 - Since it is not hard to repackage an Android app, there are many cloned apps, which we call "clones" in this work. As previous studies have reported, clones are generated for bad purposes by malicious parties, e.g., adding malicious functions, injecting/replacing advertising modules, and piracy. Besides such clones, there are legitimate, similar apps, which we call "relatives" in this work. These relatives are not clones but are similar in nature; i.e., they are generated by the same app-building service or by the same developer using the same template. Given these observations, this paper aims to answer the following two research questions: (RQ1) How can we distinguish between clones and relatives? (RQ2) What is the breakdown of clones and relatives in the official and third-party marketplaces? To answer the first research question, we developed a scalable framework called APPraiser that systematically extracts similar apps and classifies them into clones and relatives. We note that our key algorithms, which leverage the sparseness of the data, have a time complexity of O(n) in practice. To answer the second research question, we applied the APPraiser framework to over 1.3 million apps collected from official and third-party marketplaces. Our analysis revealed the following findings: In the official marketplace, 79% of similar apps were attributed to relatives, while in the third-party marketplace, 50% of similar apps were attributed to clones. The majority of relatives are apps developed by prolific developers in both marketplaces. We also found that, of the clones in the third-party market that were originally published in the official market, 76% are malware. To the best of our knowledge, this is the first work to clarify the breakdown of "similar" Android apps and to quantify their origins using a huge dataset comparable in size to the official market.
AB - Since it is not hard to repackage an Android app, there are many cloned apps, which we call "clones" in this work. As previous studies have reported, clones are generated for bad purposes by malicious parties, e.g., adding malicious functions, injecting/replacing advertising modules, and piracy. Besides such clones, there are legitimate, similar apps, which we call "relatives" in this work. These relatives are not clones but are similar in nature; i.e., they are generated by the same app-building service or by the same developer using the same template. Given these observations, this paper aims to answer the following two research questions: (RQ1) How can we distinguish between clones and relatives? (RQ2) What is the breakdown of clones and relatives in the official and third-party marketplaces? To answer the first research question, we developed a scalable framework called APPraiser that systematically extracts similar apps and classifies them into clones and relatives. We note that our key algorithms, which leverage the sparseness of the data, have a time complexity of O(n) in practice. To answer the second research question, we applied the APPraiser framework to over 1.3 million apps collected from official and third-party marketplaces. Our analysis revealed the following findings: In the official marketplace, 79% of similar apps were attributed to relatives, while in the third-party marketplace, 50% of similar apps were attributed to clones. The majority of relatives are apps developed by prolific developers in both marketplaces. We also found that, of the clones in the third-party market that were originally published in the official market, 76% are malware. To the best of our knowledge, this is the first work to clarify the breakdown of "similar" Android apps and to quantify their origins using a huge dataset comparable in size to the official market.
KW - Android
KW - Large-scale data
KW - Mobile security
KW - Repackaging
UR - http://www.scopus.com/inward/record.url?scp=84966621847&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84966621847&partnerID=8YFLogxK
U2 - 10.1145/2875475.2875480
DO - 10.1145/2875475.2875480
M3 - Conference contribution
AN - SCOPUS:84966621847
T3 - IWSPA 2016 - Proceedings of the 2016 ACM International Workshop on Security and Privacy Analytics, co-located with CODASPY 2016
SP - 25
EP - 32
BT - IWSPA 2016 - Proceedings of the 2016 ACM International Workshop on Security and Privacy Analytics, co-located with CODASPY 2016
PB - Association for Computing Machinery, Inc
Y2 - 11 March 2016
ER -