Statistical estimation of the names of HTTPS servers with domain name graphs

Tatsuya Mori*, Takeru Inoue, Akihiro Shimoda, Kazumichi Sato, Shigeaki Harada, Keisuke Ishibashi, Shigeki Goto

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

9 Citations (Scopus)

Abstract

Adoption of SSL/TLS to protect the privacy of web users has become increasingly common. In fact, as of September 2015, more than 68% of top-1M websites deploy SSL/TLS to encrypt their traffic. The transition from HTTP to HTTPS has brought a new challenge for network operators who need to understand the hostnames of encrypted web traffic for various reasons. To meet the challenge, this work develops a novel framework called SFMap, which estimates names of HTTPS servers by analyzing precedent DNS queries/responses in a statistical way. The SFMap framework introduces domain name graph, which can characterize highly dynamic and diverse nature of DNS mechanisms. Such complexity arises from the recent deployment and implementation of DNS ecosystems; i.e., canonical name tricks used by CDNs, the dynamic and diverse nature of DNS TTL settings, and incomplete and unpredictable measurements due to the existence of various DNS caching instances. First, we demonstrate that SFMap establishes good estimation accuracies and outperforms a state-of-the-art approach. We also aim to identify the optimized setting of the SFMap framework. Next, based on the preliminary analysis, we introduce techniques to make the SFMap framework scalable to large-scale traffic data. We validate the effectiveness of the approach using large-scale Internet traffic.

Original languageEnglish
Pages (from-to)104-113
Number of pages10
JournalComputer Communications
Volume94
DOIs
Publication statusPublished - 2016

Keywords

  • DNS
  • Graph
  • SSL/TLS
  • Traffic analysis

ASJC Scopus subject areas

  • Computer Networks and Communications

Fingerprint

Dive into the research topics of 'Statistical estimation of the names of HTTPS servers with domain name graphs'. Together they form a unique fingerprint.

Cite this