TY - GEN
T1 - Theoretical Analysis of the Advantage of Deepening Neural Networks
AU - Esaki, Yasushi
AU - Nakahara, Yuta
AU - Matsushima, Toshiyasu
N1 - Funding Information:
This research was supported by JSPS Grants-in-Aid for Scientific Research JP17K00316, JP17K06446, JP18K11585, and JP19K04914.
Publisher Copyright:
© 2020 IEEE.
PY - 2020/12
Y1 - 2020/12
N2 - We propose two new criteria for understanding the advantage of deepening neural networks. To understand this advantage, it is important to know the expressivity of the functions computable by deep neural networks: unless deep neural networks have sufficient expressivity, they cannot achieve good performance even if learning succeeds. The proposed criteria contribute to this understanding because they evaluate expressivity independently of the efficiency of learning. The first criterion measures the approximation accuracy of deep neural networks with respect to the target function; it is motivated by the fact that the goal of deep learning is to approximate the target function by deep neural networks. The second criterion measures a property of the linear regions of the functions computable by deep neural networks; it is motivated by the fact that functions computed by deep neural networks with piecewise-linear activation functions are themselves piecewise linear. Furthermore, using the two criteria, we show that increasing the number of layers is more effective than increasing the number of units per layer for improving the expressivity of deep neural networks.
AB - We propose two new criteria for understanding the advantage of deepening neural networks. To understand this advantage, it is important to know the expressivity of the functions computable by deep neural networks: unless deep neural networks have sufficient expressivity, they cannot achieve good performance even if learning succeeds. The proposed criteria contribute to this understanding because they evaluate expressivity independently of the efficiency of learning. The first criterion measures the approximation accuracy of deep neural networks with respect to the target function; it is motivated by the fact that the goal of deep learning is to approximate the target function by deep neural networks. The second criterion measures a property of the linear regions of the functions computable by deep neural networks; it is motivated by the fact that functions computed by deep neural networks with piecewise-linear activation functions are themselves piecewise linear. Furthermore, using the two criteria, we show that increasing the number of layers is more effective than increasing the number of units per layer for improving the expressivity of deep neural networks.
KW - approximation accuracy
KW - deep learning theory
KW - expressivity
KW - linear region
UR - http://www.scopus.com/inward/record.url?scp=85102517006&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85102517006&partnerID=8YFLogxK
U2 - 10.1109/ICMLA51294.2020.00081
DO - 10.1109/ICMLA51294.2020.00081
M3 - Conference contribution
AN - SCOPUS:85102517006
T3 - Proceedings - 19th IEEE International Conference on Machine Learning and Applications, ICMLA 2020
SP - 479
EP - 484
BT - Proceedings - 19th IEEE International Conference on Machine Learning and Applications, ICMLA 2020
A2 - Wani, M. Arif
A2 - Luo, Feng
A2 - Li, Xiaolin
A2 - Dou, Dejing
A2 - Bonchi, Francesco
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 19th IEEE International Conference on Machine Learning and Applications, ICMLA 2020
Y2 - 14 December 2020 through 17 December 2020
ER -