Clone or relative? Understanding the origins of similar Android apps

Yuta Ishii, Takuya Watanabe, Mitsuaki Akiyama, Tatsuya Mori

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Citations (Scopus)

Abstract

Because repackaging an Android app is not hard, there are many cloned apps, which we call "clones" in this work. As previous studies have reported, malicious parties generate clones for harmful purposes, e.g., adding malicious functions, injecting or replacing advertising modules, and piracy. Besides such clones, there are legitimate similar apps, which we call "relatives" in this work. These relatives are not clones but are similar in nature; i.e., they are generated by the same app-building service or by the same developer using the same template. Given these observations, this paper aims to answer the following two research questions: (RQ1) How can we distinguish between clones and relatives? (RQ2) What is the breakdown of clones and relatives in the official and third-party marketplaces? To answer the first research question, we developed a scalable framework called APPraiser that systematically extracts similar apps and classifies them into clones and relatives. We note that our key algorithms, which leverage the sparseness of the data, have a time complexity of O(n) in practice. To answer the second research question, we applied the APPraiser framework to more than 1.3 million apps collected from official and third-party marketplaces. Our analysis revealed the following findings: in the official marketplace, 79% of similar apps were attributed to relatives, while in the third-party marketplace, 50% of similar apps were attributed to clones. The majority of relatives are apps developed by prolific developers in both marketplaces. We also found that, of the clones in the third-party market that were originally published in the official market, 76% are malware. To the best of our knowledge, this is the first work that clarifies the breakdown of "similar" Android apps and quantifies their origins using a dataset comparable in size to the official market.
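
As an illustration only, the clone/relative distinction described in the abstract can be sketched in a few lines of Python. This is not the APPraiser implementation, which the abstract does not detail. The `App` record, its `signer` and `code_digest` fields, and the rule "same signer means relative, different signer means clone" are all assumptions made for this sketch, and the rule covers only the same-developer case, not apps produced by a shared app-building service. Bucketing by an exact digest in a single pass is a simplified stand-in for similarity extraction; it only hints at how grouping can stay linear in the number of apps.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    package: str      # hypothetical app identifier
    signer: str       # hypothetical developer-certificate fingerprint
    code_digest: str  # hypothetical digest standing in for a similarity signature


def group_similar(apps):
    """Bucket apps by code_digest in one pass; keep buckets with 2+ apps."""
    buckets = defaultdict(list)
    for app in apps:
        buckets[app.code_digest].append(app)
    return [group for group in buckets.values() if len(group) > 1]


def label_group(group):
    """Label every app after the first as 'relative' or 'clone'.

    Assumption: the first app in the group is the reference; an app with
    the same signer is a relative, any other signer makes it a clone.
    """
    reference = group[0]
    return {
        app.package: "relative" if app.signer == reference.signer else "clone"
        for app in group[1:]
    }


# Made-up sample data, not taken from the paper's dataset.
apps = [
    App("com.example.quiz1", "dev-A", "d1"),
    App("com.example.quiz2", "dev-A", "d1"),
    App("com.other.quiz1", "dev-B", "d1"),
    App("com.example.unique", "dev-C", "d2"),
]

labels = {}
for group in group_similar(apps):
    labels.update(label_group(group))
```

With the sample data above, the two apps signed by `dev-A` share a digest, so the second is labeled a relative, while the app signed by `dev-B` with the same digest is labeled a clone. The app with a unique digest belongs to no group and receives no label.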

Original language: English
Title of host publication: IWSPA 2016 - Proceedings of the 2016 ACM International Workshop on Security and Privacy Analytics, co-located with CODASPY 2016
Publisher: Association for Computing Machinery, Inc
Pages: 25-32
Number of pages: 8
ISBN (Electronic): 9781450340779
Publication status: Published - 2016 Mar 11
Event: 2016 2nd ACM International Workshop on Security and Privacy Analytics, IWSPA 2016 - New Orleans, United States
Duration: 2016 Mar 11 → …

Publication series

Name: IWSPA 2016 - Proceedings of the 2016 ACM International Workshop on Security and Privacy Analytics, co-located with CODASPY 2016

Other

Other: 2016 2nd ACM International Workshop on Security and Privacy Analytics, IWSPA 2016
Country/Territory: United States
City: New Orleans
Period: 16/3/11 → …

Keywords

  • Android
  • Large-scale data
  • Mobile security
  • Repackaging

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computer Networks and Communications
  • Information Systems
