Abstract
Neural networks are a powerful tool for simulating nonlinear systems. However, obtaining a reliable neural network is usually time-consuming, because the network must be trained repeatedly on the available data. Recently, several attempts have been made to accelerate neural network training on parallel hardware. One of the challenges in hardware acceleration is implementing floating-point squashing functions, such as sigmoid(x) and tanh(x), which have a vast input domain. Previous implementations of squashing functions either suffer from low speed and poor accuracy or require a large area and considerable manual work. In this paper, we present an automatic method for implementing squashing functions. Based on the proposed domain partition algorithm and coefficient compression method, we obtain squashing functions that are smaller, faster, and more precise. An experiment on sigmoid(x) shows that, compared with the conventional method, the proposed method achieves lower memory usage, an error rate up to 20,000 times smaller, a 300-times synthesis speedup, and a 50% reduction in LUT and flip-flop usage.
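To give a rough sense of what partitioning the input domain of a squashing function means, the sketch below approximates sigmoid(x) with one linear segment per sub-interval and saturates outside a bounded range. It is only an illustration under stated assumptions: the uniform partition, the range [-8, 8], the 64 segments, and the names `build_table` and `approx_sigmoid` are all hypothetical choices made here. It is not the paper's domain partition algorithm or its coefficient compression method, neither of which is described in the abstract.

```python
import math


def sigmoid(x: float) -> float:
    """Reference sigmoid used to build and check the table."""
    return 1.0 / (1.0 + math.exp(-x))


def build_table(lo: float, hi: float, segments: int):
    """Return one (slope, intercept) pair per segment of a uniform partition of [lo, hi].

    Assumption: a uniform partition is used purely for simplicity.
    """
    width = (hi - lo) / segments
    table = []
    for i in range(segments):
        x0 = lo + i * width
        x1 = x0 + width
        y0, y1 = sigmoid(x0), sigmoid(x1)
        slope = (y1 - y0) / width
        table.append((slope, y0 - slope * x0))
    return table


def approx_sigmoid(x: float, table, lo: float, hi: float) -> float:
    """Evaluate the piecewise-linear approximation, saturating outside [lo, hi]."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    width = (hi - lo) / len(table)
    # Clamp the index to guard against rounding at the upper edge.
    i = min(int((x - lo) / width), len(table) - 1)
    slope, intercept = table[i]
    return slope * x + intercept


if __name__ == "__main__":
    table = build_table(-8.0, 8.0, 64)
    xs = [i / 100.0 for i in range(-1000, 1001)]
    worst = max(abs(approx_sigmoid(x, table, -8.0, 8.0) - sigmoid(x)) for x in xs)
    print(f"max abs error on [-10, 10]: {worst:.2e}")
```

In this toy version, the table size (number of segments) trades directly against the approximation error, which is the kind of size and accuracy trade-off the abstract refers to.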
Original language | English |
---|---|
Title of host publication | International Conference on Solid-State and Integrated Circuits Technology Proceedings, ICSICT |
Pages | 2204-2207 |
Number of pages | 4 |
Publication status | Published - 2008 |
Event | 2008 9th International Conference on Solid-State and Integrated-Circuit Technology, ICSICT 2008 - Beijing. Duration: 2008 Oct 20 → 2008 Oct 23 |
Other
Other | 2008 9th International Conference on Solid-State and Integrated-Circuit Technology, ICSICT 2008 |
---|---|
City | Beijing |
Period | 08/10/20 → 08/10/23 |
ASJC Scopus subject areas
- Electrical and Electronic Engineering
- Condensed Matter Physics
- Electronic, Optical and Magnetic Materials