TY - JOUR
T1 - Change-Point Detection in a Sequence of Bags-of-Data
AU - Koshijima, Kensuke
AU - Hino, Hideitsu
AU - Murata, Noboru
N1 - Publisher Copyright: © 1989-2012 IEEE.
PY - 2015/10/1
Y1 - 2015/10/1
AB - In this paper, the limitation that is prominent in most existing works of change-point detection methods is addressed by proposing a nonparametric, computationally efficient method. The limitation is that most works assume that each data point observed at each time step is a single multi-dimensional vector. However, there are many situations where this does not hold. Therefore, a setting where each observation is a collection of random variables, which we call a bag of data, is considered. After estimating the underlying distribution behind each bag of data and embedding those distributions in a metric space, the change-point score is derived by evaluating how the sequence of distributions is fluctuating in the metric space using a distance-based information estimator. Also, a procedure that adaptively determines when to raise alerts is incorporated by calculating the confidence interval of the change-point score at each time step. This avoids raising false alarms in highly noisy situations and enables detecting changes of various magnitudes. A number of experimental studies and numerical examples are provided to demonstrate the generality and the effectiveness of our approach with both synthetic and real datasets.
KW - Change-point detection
KW - Earth Movers Distance
KW - anomaly detection
KW - entropy estimator
UR - http://www.scopus.com/inward/record.url?scp=84941585413&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84941585413&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2015.2426693
DO - 10.1109/TKDE.2015.2426693
M3 - Article
AN - SCOPUS:84941585413
SN - 1041-4347
VL - 27
SP - 2632
EP - 2644
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 10
M1 - 7095580
ER -