Abstract
The Mahalanobis-Taguchi (MT) method is a standard multivariate analysis method for detecting anomalies and recognizing patterns, and a number of case studies using it have been reported. However, it performs well only when a sufficient number of samples is available; when the number of samples is insufficient, the method has a large probability bias. In this paper, we first analyze existing countermeasures, which commonly rely on dimension reduction such as variable selection, and show that they have problems when testing unknown data. Second, we propose two analytical procedures for small-sample data that take the detection capability for unknown data into account. In the proposed procedures, when the number of data samples is small relative to the number of variables, the detection measure of the MT method is replaced by a measure derived either by approximating the correlation matrix with probabilistic principal component analysis (PPCA) or by introducing ensemble learning. Finally, based on an analysis of raw data from the KDDCup99 dataset and on simulation results, we consider how the proposed procedures should be applied when multicollinearity occurs and which of the two procedures should be chosen according to the data pattern.
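To make the small-sample problem concrete, the following is a minimal sketch, not the procedure proposed in the paper. It assumes the standard closed-form PPCA covariance model (leading eigen-directions kept, remaining eigenvalues averaged into an isotropic noise term) applied to the correlation matrix of the unit space, and it scales the squared distance by the number of variables as is usual in the MT method. The function names and the choice of `q` (number of retained components) are hypothetical and serve illustration only.

```python
import numpy as np

def fit_unit_space(X):
    """Mean, standard deviation, and sample correlation matrix of the reference data."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    R = (Z.T @ Z) / (X.shape[0] - 1)
    return mean, std, R

def ppca_correlation(R, q):
    """Approximate R by W W^T + sigma2 * I (closed-form PPCA model, assumed here)."""
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]              # sort eigenvalues in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    sigma2 = eigvals[q:].mean()                    # average of the discarded eigenvalues
    W = eigvecs[:, :q] * np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    return W @ W.T + sigma2 * np.eye(R.shape[0])

def md_squared(x, mean, std, C):
    """Squared Mahalanobis distance of x, divided by the number of variables."""
    z = (x - mean) / std
    return float(z @ np.linalg.solve(C, z)) / len(z)

# Synthetic example: 10 samples in 20 dimensions, so R itself is singular.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 20))
mean, std, R = fit_unit_space(X)
C = ppca_correlation(R, q=3)                       # q=3 is an arbitrary illustrative choice
```

With fewer samples than variables the sample correlation matrix cannot be inverted, whereas the approximated matrix `C` is positive definite, so `md_squared` can still be evaluated for unknown data.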
Original language | English |
---|---|
Pages (from-to) | 30-38 |
Number of pages | 9 |
Journal | Journal of Japan Industrial Management Association |
Volume | 66 |
Issue number | 1 |
Publication status | Published - 2015 |
Keywords
- Ensemble learning
- MT method
- Probabilistic principal component analysis
- Taguchi method
ASJC Scopus subject areas
- Industrial and Manufacturing Engineering
- Applied Mathematics
- Management Science and Operations Research
- Strategy and Management