The adaptive natural gradient descent (ANGD) method realizes natural gradient descent (NGD) without requiring knowledge of the input distribution of the learning data, and it reduces the computational cost from cubic to quadratic order. However, no performance analysis of ANGD has been carried out. We have developed a statistical-mechanical theory of a simplified version of ANGD dynamics for soft committee machines in on-line learning; this theory provides deterministic learning dynamics expressed through a few order parameters, even though ANGD intrinsically holds a large approximated Fisher information matrix. Numerical results obtained using this theory were consistent with those of a simulation, with respect not only to the learning curve but also to the learning failure. Using this theory, we numerically evaluated the efficiency of ANGD and found that ANGD generally performs as well as NGD. We also identified the key condition affecting the learning plateau in ANGD.
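To illustrate why an adaptive scheme can cost quadratic rather than cubic time, here is a minimal sketch of one commonly cited form of the adaptive update: the inverse Fisher estimate is refreshed with a rank-one correction instead of being recomputed by matrix inversion. This is an assumption for illustration and not necessarily the simplified variant analyzed in the paper; the function names, the learning rate `lr`, and the adaptation rate `eps` are hypothetical, and `G_inv` is assumed to be symmetric.

```python
def matvec(A, v):
    """Dense matrix-vector product, O(N^2) for an N x N matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def adaptive_inverse_update(G_inv, grad, eps):
    """First-order update of an inverse-Fisher estimate (illustrative form).

    Approximates the inverse of (1 - eps) * G + eps * grad grad^T to first
    order in eps, using only a matrix-vector product and an outer product,
    so each step costs O(N^2) instead of the O(N^3) of a full inversion.
    Assumes G_inv is symmetric, so that grad^T G_inv equals (G_inv grad)^T.
    """
    u = matvec(G_inv, grad)
    n = len(grad)
    return [
        [(1.0 + eps) * G_inv[i][j] - eps * u[i] * u[j] for j in range(n)]
        for i in range(n)
    ]


def natural_gradient_step(w, G_inv, grad, lr):
    """Move the weights along the preconditioned direction G_inv @ grad."""
    direction = matvec(G_inv, grad)
    return [wi - lr * di for wi, di in zip(w, direction)]


# Usage: start from the identity as the inverse estimate (an assumption).
G_inv = [[1.0, 0.0], [0.0, 1.0]]
grad = [1.0, 2.0]
G_inv = adaptive_inverse_update(G_inv, grad, eps=0.1)
w = natural_gradient_step([0.0, 0.0], G_inv, grad, lr=0.5)
```

Because the update is built from a symmetric outer product, a symmetric starting estimate stays symmetric, which is what lets the sketch reuse `u` for both factors.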
Physical Review E - Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics
Published - 2004 Jan 1
ASJC Scopus subject areas
- Statistical and Nonlinear Physics
- Statistics and Probability
- Condensed Matter Physics