TY - JOUR
T1 - Overfitting measurement of convolutional neural networks using trained network weights
AU - Watanabe, Satoru
AU - Yamana, Hayato
N1 - Funding Information:
We are grateful to Hitachi, Ltd. for the tuition subsidy. The funder had no role in either the study design or the technical investigation in this paper.
Publisher Copyright:
© 2022, The Author(s), under exclusive licence to Springer Nature Switzerland AG.
PY - 2022/9
Y1 - 2022/9
AB - Overfitting reduces the generalizability of convolutional neural networks (CNNs). Overfitting is generally detected by comparing the accuracies and losses on the training and validation data, where the validation data are formed from a portion of the training data; however, such detection methods are ineffective for pretrained networks distributed without the training data. Thus, in this paper, we propose a method, inspired by the dropout technique, to detect overfitting of CNNs using the trained network weights. The dropout technique has been employed to prevent CNNs from overfitting by randomly invalidating neurons during training. It has been hypothesized that this technique prevents overfitting by restraining co-adaptations among neurons; this hypothesis implies that overfitting results from co-adaptations among neurons and can be detected by investigating the inner representation of CNNs. The proposed persistent homology-based overfitting measure (PHOM) constructs clique complexes in CNNs using the trained network weights and uses one-dimensional persistent homology to investigate co-adaptations among neurons. In addition, we extend PHOM to normalized PHOM (NPHOM) to mitigate fluctuations in PHOM caused by differences in network structure. We applied the proposed methods to CNNs trained for classification problems on the CIFAR-10, Street View House Numbers (SVHN), Tiny ImageNet, and CIFAR-100 datasets. Experimental results demonstrate that PHOM and NPHOM can indicate the degree of overfitting of CNNs, which suggests that these methods enable us to filter out overfitted CNNs without requiring the training data.
KW - Convolutional neural network
KW - Overfitting
KW - Persistent homology
KW - Topological data analysis
UR - http://www.scopus.com/inward/record.url?scp=85129815125&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85129815125&partnerID=8YFLogxK
U2 - 10.1007/s41060-022-00332-1
DO - 10.1007/s41060-022-00332-1
M3 - Article
AN - SCOPUS:85129815125
SN - 2364-415X
VL - 14
SP - 261
EP - 278
JO - International Journal of Data Science and Analytics
JF - International Journal of Data Science and Analytics
IS - 3
ER -