TY - JOUR
T1 - Motion estimation optimization for H.264/AVC using source image edge features
AU - Liu, Zhenyu
AU - Zhou, Junwei
AU - Goto, Satoshi
AU - Ikenaga, Takeshi
N1 - Funding Information:
Manuscript received January 1, 2008; revised June 15, 2008, August 31, 2008, and December 15, 2008. First version published May 12, 2009; current version published August 14, 2009. This work was supported by CREST JST. This paper was recommended by Associate Editor G. Wen. Z. Liu is with RIIT of Tsinghua University, Beijing, 100084 China (e-mail: liuzhenyu73@tsinghua.edu.cn). J. Zhou is with the Sun Microsystems Incorporation, Santa Clara, CA 95054 USA (e-mail: junwei.zhou@sun.com). S. Goto and T. Ikenaga are with the Graduate School of IPS, Waseda University, Tokyo, 808-0135 Japan (e-mail: goto@waseda.jp; ikenaga@waseda.jp). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TCSVT.2009.2022796
PY - 2009/8
Y1 - 2009/8
N2 - The H.264/AVC coding standard uses variable block size motion-compensated prediction with multiple reference frames to achieve a pronounced improvement in compression efficiency. Accordingly, the computation of motion estimation increases in proportion to the product of the number of reference frames and the number of intermodes. The mathematical analysis in this paper illustrates that the motion-compensated prediction errors are mainly determined by the detailed textures in the source image. An image block that is rich in textures contains numerous high-frequency signals, which make variable block size and multiple reference frame techniques essential. On the basis of rate-distortion theory, this paper defines the spatial homogeneity of an image block as a relative concept with respect to the current quantization step. For a homogeneous block, its futile reference frames and intermodes can be eliminated efficiently. It is further revealed that the sum of absolute differences value of an image block is mainly determined by the sum of its edge gradient amplitude and the current quantization step. Consequently, an image content-based early termination algorithm is proposed, and it outperforms the original method adopted by the JVT reference software. Moreover, a dynamic search range algorithm based on the edge gradient amplitude of the source image block is analyzed. One notable advantage of the proposed edge-based algorithms is their suitability for the macroblock-pipelining architecture, and another desirable feature is their orthogonality to fast block-matching algorithms. Experimental results show that when these algorithms are integrated with hybrid unsymmetrical-cross multi-hexagon-grid search, 31.4-60.0% of motion estimation time can be saved on average, whereas the average BDPSNR loss is 0.0497 dB for all tested sequences.
AB - The H.264/AVC coding standard uses variable block size motion-compensated prediction with multiple reference frames to achieve a pronounced improvement in compression efficiency. Accordingly, the computation of motion estimation increases in proportion to the product of the number of reference frames and the number of intermodes. The mathematical analysis in this paper illustrates that the motion-compensated prediction errors are mainly determined by the detailed textures in the source image. An image block that is rich in textures contains numerous high-frequency signals, which make variable block size and multiple reference frame techniques essential. On the basis of rate-distortion theory, this paper defines the spatial homogeneity of an image block as a relative concept with respect to the current quantization step. For a homogeneous block, its futile reference frames and intermodes can be eliminated efficiently. It is further revealed that the sum of absolute differences value of an image block is mainly determined by the sum of its edge gradient amplitude and the current quantization step. Consequently, an image content-based early termination algorithm is proposed, and it outperforms the original method adopted by the JVT reference software. Moreover, a dynamic search range algorithm based on the edge gradient amplitude of the source image block is analyzed. One notable advantage of the proposed edge-based algorithms is their suitability for the macroblock-pipelining architecture, and another desirable feature is their orthogonality to fast block-matching algorithms. Experimental results show that when these algorithms are integrated with hybrid unsymmetrical-cross multi-hexagon-grid search, 31.4-60.0% of motion estimation time can be saved on average, whereas the average BDPSNR loss is 0.0497 dB for all tested sequences.
KW - Edge gradient
KW - Fast mode decision
KW - H.264/AVC
KW - Motion estimation (ME)
KW - Multiple reference frame (MRF)
KW - Variable block size (VBS)
UR - http://www.scopus.com/inward/record.url?scp=69449108169&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=69449108169&partnerID=8YFLogxK
U2 - 10.1109/TCSVT.2009.2022796
DO - 10.1109/TCSVT.2009.2022796
M3 - Article
AN - SCOPUS:69449108169
SN - 1051-8215
VL - 19
SP - 1095
EP - 1107
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
IS - 8
M1 - 4914856
ER -