TY - JOUR
T1 - Approximate DCT Design for Video Encoding Based on Novel Truncation Scheme
AU - Sun, Heming
AU - Cheng, Zhengxue
AU - Gharehbaghi, Amir Masoud
AU - Kimura, Shinji
AU - Fujita, Masahiro
N1 - Funding Information:
Manuscript received June 12, 2018; revised October 3, 2018, October 27, 2018, and November 7, 2018; accepted November 8, 2018. Date of publication December 4, 2018; date of current version March 15, 2019. This work was supported in part by Grants-in-Aid for Scientific Research from JSPS and a research fund from NEC. This paper was recommended by Associate Editor G. Masera. (Corresponding author: Heming Sun.) H. Sun is with the Waseda Research Institute for Science and Engineering, Tokyo 169-8555, Japan (e-mail: hemingsun@aoni.waseda.jp).
Publisher Copyright:
© 2018 IEEE.
PY - 2019/4
Y1 - 2019/4
AB - This paper presents an energy- and area-efficient architecture for an approximate discrete cosine transform (DCT). Owing to its good compression ability, the DCT is widely used in signal processing. However, it is computationally intensive, especially for large transform sizes. In this paper, we reduce the computation cost of the DCT by truncating a few least significant bits (LSBs), a few most significant bits (MSBs), and all-zero columns. First, because the final right-shift operation weakens the contribution of the LSBs, we eliminate the computation for some LSBs. For the addition of the remaining LSBs, a parallel carry propagation adder is proposed to reduce the calculation latency. Second, because high-frequency components are quite small in natural scenes, a few MSBs are selectively truncated according to their positions. Third, quantization is taken into account for system-level optimization: the quantized results of all-zero columns are used to skip the subsequent column transforms. The experimental results show that area can be reduced by up to 32% and power consumption by up to 60% compared with the original accurate DCT, while the compression efficiency loss caused by the DCT approximation is negligible for High Efficiency Video Coding.
KW - DCT
KW - HEVC
KW - VVC
KW - approximate computing
KW - truncation
UR - http://www.scopus.com/inward/record.url?scp=85058131307&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85058131307&partnerID=8YFLogxK
U2 - 10.1109/TCSI.2018.2882474
DO - 10.1109/TCSI.2018.2882474
M3 - Article
AN - SCOPUS:85058131307
SN - 1549-8328
VL - 66
SP - 1517
EP - 1530
JO - IEEE Transactions on Circuits and Systems I: Regular Papers
JF - IEEE Transactions on Circuits and Systems I: Regular Papers
IS - 4
M1 - 8558684
ER -