TY - GEN
T1 - Unsupervised ensemble anomaly detection through time-periodical packet sampling
AU - Nawata, Shuichi
AU - Uchida, Masato
AU - Gu, Yu
AU - Tsuru, Masato
AU - Oie, Yuji
PY - 2010/6/29
Y1 - 2010/6/29
AB - We propose an anomaly detection method that trains a baseline model describing the normal behavior of network traffic without using manually labeled traffic data. The trained baseline model is used as the basis for comparison with the audit network traffic. The proposed method can be carried out in an unsupervised manner by using time-periodical packet sampling for a purpose different from the one for which it was intended. That is, we take advantage of the lossy nature of packet sampling to extract normal packets from the unlabeled original traffic data. Using real traffic traces, we show that, in detecting anomalies in TCP SYN packets, the proposed method is comparable in false positive and false negative rates to the conventional method, which requires manually labeled traffic data to train the baseline model. In addition, to mitigate the possible performance variation due to the probabilistic nature of sampled traffic data, we devise an ensemble anomaly detection method that exploits multiple baseline models in parallel. Experimental results show that the proposed ensemble anomaly detection performs well and is not affected by the variability of time-periodical packet sampling.
UR - http://www.scopus.com/inward/record.url?scp=77953880481&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77953880481&partnerID=8YFLogxK
U2 - 10.1109/INFCOMW.2010.5466662
DO - 10.1109/INFCOMW.2010.5466662
M3 - Conference contribution
AN - SCOPUS:77953880481
SN - 9781424467396
T3 - Proceedings - IEEE INFOCOM
BT - INFOCOM 2010 - IEEE Conference on Computer Communications Workshops
T2 - IEEE Conference on Computer Communications Workshops, INFOCOM 2010
Y2 - 15 March 2010 through 19 March 2010
ER -