TY - GEN
T1 - An Area-Power-Efficient Multiplier-less Processing Element Design for CNN Accelerators
AU - Li, Jiaxiang
AU - Yanagisawa, Masao
AU - Shi, Youhua
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Machine learning has achieved remarkable success in various domains. However, the computational demands and memory requirements of these models pose challenges for deployment on privacy-secured or wearable edge devices. To address this issue, we propose an area-power-efficient multiplier-less processing element (PE) in this paper. Prior to implementing the proposed PE, we apply a power-of-2 dictionary-based quantization to the model. We analyze the effectiveness of this quantization method in preserving the accuracy of the original model and present schematic diagrams of both the standard and a specialized version of the proposed PE. Our evaluation results demonstrate that our design achieves approximately 30% lower power consumption and a 35% smaller core area compared to a conventional multiply-and-accumulate (MAC) PE. Moreover, the applied quantization reduces the model size and operand bit-width, resulting in reduced on-chip memory usage and energy consumption for memory accesses.
AB - Machine learning has achieved remarkable success in various domains. However, the computational demands and memory requirements of these models pose challenges for deployment on privacy-secured or wearable edge devices. To address this issue, we propose an area-power-efficient multiplier-less processing element (PE) in this paper. Prior to implementing the proposed PE, we apply a power-of-2 dictionary-based quantization to the model. We analyze the effectiveness of this quantization method in preserving the accuracy of the original model and present schematic diagrams of both the standard and a specialized version of the proposed PE. Our evaluation results demonstrate that our design achieves approximately 30% lower power consumption and a 35% smaller core area compared to a conventional multiply-and-accumulate (MAC) PE. Moreover, the applied quantization reduces the model size and operand bit-width, resulting in reduced on-chip memory usage and energy consumption for memory accesses.
KW - multiplier-less processing element
KW - area-efficient
KW - energy-efficient
KW - machine learning model quantization
UR - http://www.scopus.com/inward/record.url?scp=85184580528&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85184580528&partnerID=8YFLogxK
U2 - 10.1109/ASICON58565.2023.10396536
DO - 10.1109/ASICON58565.2023.10396536
M3 - Conference contribution
AN - SCOPUS:85184580528
T3 - Proceedings of International Conference on ASIC
BT - Proceedings of 2023 IEEE 15th International Conference on ASIC, ASICON 2023
A2 - Ye, Fan
A2 - Tang, Ting-Ao
PB - IEEE Computer Society
T2 - 15th IEEE International Conference on ASIC, ASICON 2023
Y2 - 24 October 2023 through 27 October 2023
ER -