TY - GEN
T1 - Spatio-temporal factorization of log data for understanding network events
AU - Kimura, Tatsuaki
AU - Ishibashi, Keisuke
AU - Mori, Tatsuya
AU - Sawada, Hiroshi
AU - Toyono, Tsuyoshi
AU - Nishimatsu, Ken
AU - Watanabe, Akio
AU - Shimoda, Akihiro
AU - Shiomoto, Kohei
N1 - Copyright 2014 Elsevier B.V., All rights reserved.
PY - 2014
Y1 - 2014
N2 - Understanding the impacts and patterns of network events such as link flaps or hardware errors is crucial for diagnosing network anomalies. In large production networks, analyzing the log messages that record network events has become a challenging task for the following two reasons. First, the log messages are composed of unstructured text messages generated by vendor-specific rules. Second, network equipment such as routers, switches, and RADIUS servers generate various log messages induced by network events that span across several geographical locations, network layers, protocols, and services. In this paper, we have tackled these obstacles by building two novel techniques: statistical template extraction (STE) and log tensor factorization (LTF). STE leverages a statistical clustering technique to automatically extract primary templates from unstructured log messages. LTF aims to build a statistical model that captures spatial-temporal patterns of log messages. Such spatial-temporal patterns provide useful insights into understanding the impacts and root cause of hidden network events. This paper first formulates our problem in a mathematical way. We then validate our techniques using a massive amount of network log messages collected from a large operating network. We also demonstrate several case studies that validate the usefulness of our technique.
AB - Understanding the impacts and patterns of network events such as link flaps or hardware errors is crucial for diagnosing network anomalies. In large production networks, analyzing the log messages that record network events has become a challenging task for the following two reasons. First, the log messages are composed of unstructured text messages generated by vendor-specific rules. Second, network equipment such as routers, switches, and RADIUS servers generate various log messages induced by network events that span across several geographical locations, network layers, protocols, and services. In this paper, we have tackled these obstacles by building two novel techniques: statistical template extraction (STE) and log tensor factorization (LTF). STE leverages a statistical clustering technique to automatically extract primary templates from unstructured log messages. LTF aims to build a statistical model that captures spatial-temporal patterns of log messages. Such spatial-temporal patterns provide useful insights into understanding the impacts and root cause of hidden network events. This paper first formulates our problem in a mathematical way. We then validate our techniques using a massive amount of network log messages collected from a large operating network. We also demonstrate several case studies that validate the usefulness of our technique.
UR - http://www.scopus.com/inward/record.url?scp=84904438214&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84904438214&partnerID=8YFLogxK
U2 - 10.1109/INFOCOM.2014.6847986
DO - 10.1109/INFOCOM.2014.6847986
M3 - Conference contribution
AN - SCOPUS:84904438214
SN - 9781479933600
T3 - Proceedings - IEEE INFOCOM
SP - 610
EP - 618
BT - IEEE INFOCOM 2014 - IEEE Conference on Computer Communications
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 33rd IEEE Conference on Computer Communications, IEEE INFOCOM 2014
Y2 - 27 April 2014 through 2 May 2014
ER -