Abstract
Outlier mining in high-dimensional data is not fully resolved because of the peculiarities of high dimensionality. Existing full-space methods find distinct outliers but neglect those hidden in subspaces. Subspace-based approaches detect most outliers that are apparent in low-dimensional spaces but miss outliers that remain invisible in those subspaces. This paper proposes a novel two-phase inspection model. The first phase measures the density of each object's neighbors in subspaces to find low-dimensional outliers. The second phase evaluates the degree of deviation of neighbors in connected subspaces, where outliers not yet discovered disperse quickly and scatter more than their neighbors. We analyze the results of the two phases statistically and merge them into one score for each object. The objects with the top scores are reported as outliers. Evaluation on synthetic and real data sets shows that our proposal outperforms state-of-the-art algorithms in high-dimensional outlier detection.
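The two-phase idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not define the subspaces, the density measure, the notion of "connected" subspaces, or the merging rule, so every choice below is an assumption made for illustration. Subspaces are taken to be all 2-D axis-parallel projections, density is approximated by the mean k-nearest-neighbor distance, two subspaces count as connected when they share a dimension, and the two phases are merged by summing z-scores. The function names (`two_phase_scores`, `knn`, `zscore`) are hypothetical.

```python
import itertools
import numpy as np


def knn(X, k):
    """Brute-force k nearest neighbours of every row (the row itself excluded)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    idx = np.argsort(d, axis=1)[:, :k]
    return idx, np.take_along_axis(d, idx, axis=1)


def zscore(v):
    """Standardise a score vector; a constant vector maps to zeros."""
    s = v.std()
    return (v - v.mean()) / s if s > 0 else np.zeros_like(v)


def two_phase_scores(X, k=5):
    """Illustrative two-phase outlier score; higher means more outlying."""
    n, dims = X.shape
    # Assumption: the subspaces are all 2-D axis-parallel projections.
    subspaces = [list(c) for c in itertools.combinations(range(dims), 2)]
    phase1 = np.zeros(n)
    phase2 = np.zeros(n)
    for s in subspaces:
        idx, dist = knn(X[:, s], k)
        # Phase 1 (assumed form): a sparse neighbourhood in subspace s
        # shows up as a large mean k-NN distance.
        phase1 = np.maximum(phase1, zscore(dist.mean(axis=1)))
        # Phase 2 (assumed form): follow the same neighbours into each
        # connected subspace t and compare the object's distance from them
        # with the neighbours' own spread.
        for t in subspaces:
            if t == s or not set(s) & set(t):
                continue
            Y = X[:, t]
            nb = Y[idx]                      # shape (n, k, 2)
            centre = nb.mean(axis=1)
            spread = np.linalg.norm(nb - centre[:, None, :], axis=-1).mean(axis=1)
            own = np.linalg.norm(Y - centre, axis=1)
            phase2 = np.maximum(phase2, zscore(own / (spread + 1e-12)))
    # Assumed merging rule: sum of the standardised phase scores.
    return zscore(phase1) + zscore(phase2)
```

Calling `two_phase_scores(X)` on an `(n, d)` NumPy array returns one score per object, and the highest-scoring objects would be reported as outliers. The brute-force neighbor search is quadratic in `n` and is only meant for small illustrative data sets.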
Original language | English |
---|---|
Title of host publication | International Conference on Information and Knowledge Management, Proceedings |
Publisher | Association for Computing Machinery |
Pages | 57-62 |
Number of pages | 6 |
Volume | 2014-November |
Edition | November |
DOI | |
Publication status | Published - Nov 3 2014 |
Event | 7th PhD Workshop in Information and Knowledge Management, PIKM 2014, in Conjunction with the ACM CIKM 2014 Conference - Shanghai, China Duration: Nov 3 2014 → … |
Other
Other | 7th PhD Workshop in Information and Knowledge Management, PIKM 2014, in Conjunction with the ACM CIKM 2014 Conference |
---|---|
Country/Territory | China |
City | Shanghai |
Period | Nov 3 2014 → … |
ASJC Scopus subject areas
- General Business, Management and Accounting
- General Decision Sciences