TY - GEN
T1 - A high performance digital neural processor design by Network on Chip architecture
AU - Yiping, Dong
AU - Ce, Li
AU - Hui, Liu
AU - Takahiro, Watanabe
PY - 2011
Y1 - 2011
N2 - This paper describes a high performance neural processor by using a Network on Chip (NoC) architecture to solve the interconnection and performance problems in hardware neural networks. The proposed NoC-based neural processor is composed of 20 tiles in 45 2-D array, and each tile includes a Process Element (PE) and a packet switched router. In each PE, four neurons are implemented to achieve low communication load. The network is 2D torus topology, and it has a 32 G/s bandwidth and asynchronous clocking system. Our proposed neural processor is designed using 90-nm CMOS technology with one Poly and nine metals, and its performance is evaluated. As a result, it can achieve over 3.1 G Connection Per Second (CPS) of performance while power dissipation is 1.1317 W at 1.2 V supply-voltage and 25 mm2 chip area. Compared with the other existing hardware neural networks, the proposed processor can achieve low communication load and high performance, and it is reconfigurable and extendable.
AB - This paper describes a high performance neural processor by using a Network on Chip (NoC) architecture to solve the interconnection and performance problems in hardware neural networks. The proposed NoC-based neural processor is composed of 20 tiles in 45 2-D array, and each tile includes a Process Element (PE) and a packet switched router. In each PE, four neurons are implemented to achieve low communication load. The network is 2D torus topology, and it has a 32 G/s bandwidth and asynchronous clocking system. Our proposed neural processor is designed using 90-nm CMOS technology with one Poly and nine metals, and its performance is evaluated. As a result, it can achieve over 3.1 G Connection Per Second (CPS) of performance while power dissipation is 1.1317 W at 1.2 V supply-voltage and 25 mm2 chip area. Compared with the other existing hardware neural networks, the proposed processor can achieve low communication load and high performance, and it is reconfigurable and extendable.
UR - http://www.scopus.com/inward/record.url?scp=79959526454&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79959526454&partnerID=8YFLogxK
U2 - 10.1109/VDAT.2011.5783621
DO - 10.1109/VDAT.2011.5783621
M3 - Conference contribution
AN - SCOPUS:79959526454
SN - 9781424484997
T3 - Proceedings of 2011 International Symposium on VLSI Design, Automation and Test, VLSI-DAT 2011
SP - 243
EP - 246
BT - Proceedings of 2011 International Symposium on VLSI Design, Automation and Test, VLSI-DAT 2011
T2 - 2011 International Symposium on VLSI Design, Automation and Test, VLSI-DAT 2011
Y2 - 25 April 2011 through 28 April 2011
ER -