Abstract
Neural networks are a powerful tool for simulating nonlinear systems. However, obtaining a reliable neural network is usually time-consuming, since it requires repeated training on the available data. Recently, several attempts to accelerate neural network training with parallel hardware have been proposed. One challenge in hardware acceleration is implementing the floating-point squashing functions, such as sigmoid(x) and tanh(x), which have a vast input domain. Previous implementations of squashing functions either suffer from low speed and poor accuracy or require large area and considerable manual work. In this paper, we present an automatic method for implementing squashing functions. Based on the proposed domain partition algorithm and coefficient compression method, squashing functions with smaller size, higher speed, and higher precision are obtained. Experiments on sigmoid(x) show less memory usage, an error rate up to 20k times smaller, a 300-times synthesis speedup, and a 50% reduction in LUT and flip-flop usage compared with the conventional method.
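The abstract does not detail the paper's domain partition algorithm or coefficient compression scheme, so the following Python sketch only illustrates the general technique it names: recursively split the input domain of sigmoid(x) until a per-segment linear fit meets an error tolerance, then quantize the segment coefficients for table storage. The function names, the [-8, 8] domain, the tolerance, and the fixed-point word width are all illustrative assumptions, not the paper's method.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def partition_domain(f, lo, hi, tol, segments):
    """Recursively split [lo, hi] until linear interpolation of f
    between the endpoints is within tol (illustrative stand-in for
    a domain-partition step)."""
    xs = np.linspace(lo, hi, 33)
    approx = f(lo) + (f(hi) - f(lo)) * (xs - lo) / (hi - lo)
    err = np.max(np.abs(f(xs) - approx))
    if err <= tol or (hi - lo) < 1e-6:
        segments.append((lo, hi))
    else:
        mid = 0.5 * (lo + hi)
        partition_domain(f, lo, mid, tol, segments)
        partition_domain(f, mid, hi, tol, segments)

def quantize(c, frac_bits=12):
    """Fixed-point rounding as a toy stand-in for coefficient compression."""
    scale = 1 << frac_bits
    return round(c * scale) / scale

segments = []
partition_domain(sigmoid, -8.0, 8.0, tol=1e-3, segments=segments)

# One quantized (slope, intercept) pair per segment, as would be
# stored in an on-chip lookup table.
table = []
for lo, hi in segments:
    slope = (sigmoid(hi) - sigmoid(lo)) / (hi - lo)
    intercept = sigmoid(lo) - slope * lo
    table.append((lo, quantize(slope), quantize(intercept)))

print(f"{len(segments)} segments cover [-8, 8] with max error <= 1e-3")
```

Because sigmoid(x) is nearly flat far from the origin, such a recursive split naturally produces wide segments in the tails and narrow ones near x = 0, which is the intuition behind non-uniform domain partitioning in table-based hardware implementations.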
Original language | English
---|---
Title of host publication | International Conference on Solid-State and Integrated Circuits Technology Proceedings, ICSICT
Pages | 2204-2207
Number of pages | 4
DOI |
Publication status | Published - 2008
Event | 2008 9th International Conference on Solid-State and Integrated-Circuit Technology, ICSICT 2008 - Beijing; Duration: 20 Oct 2008 → 23 Oct 2008
Other

Other | 2008 9th International Conference on Solid-State and Integrated-Circuit Technology, ICSICT 2008
---|---
City | Beijing
Period | 20 Oct 2008 → 23 Oct 2008
ASJC Scopus subject areas
- Electrical and Electronic Engineering
- Condensed Matter Physics
- Electronic, Optical and Magnetic Materials