TY - GEN
T1 - Optimization of propagate partial SAD and SAD tree motion estimation hardwired engine for H.264
AU - Zhenyu, Liu
AU - Goto, Satoshi
AU - Ikenaga, Takeshi
PY - 2008
Y1 - 2008
N2 - Variable block size motion estimation is an efficient approach to reducing temporal redundancy, and it has been adopted by the latest video coding standard, H.264/AVC. The added computational complexity of the variable block size technique makes a hardwired accelerator essential, especially for real-time applications. In this paper, the authors apply architecture-level and circuit-level approaches to improve the performance of the Propagate Partial SAD and SAD Tree hardwired engines, which outperform their counterparts when the impact of supporting the variable block size technique is considered. Experiments demonstrate that, compared with the original architectures, the proposed approaches save 14.7% and 18.0% of the hardware cost for the Propagate Partial SAD and SAD Tree architectures, respectively. With TSMC 0.18 μm 1P6M CMOS technology, the proposed Propagate Partial SAD architecture attains a 231.6 MHz operating frequency at a cost of 84.1k gates. Correspondingly, the execution speed of the optimized SAD Tree architecture is improved to 204.8 MHz at a hardware cost of 88.5k gates.
AB - Variable block size motion estimation is an efficient approach to reducing temporal redundancy, and it has been adopted by the latest video coding standard, H.264/AVC. The added computational complexity of the variable block size technique makes a hardwired accelerator essential, especially for real-time applications. In this paper, the authors apply architecture-level and circuit-level approaches to improve the performance of the Propagate Partial SAD and SAD Tree hardwired engines, which outperform their counterparts when the impact of supporting the variable block size technique is considered. Experiments demonstrate that, compared with the original architectures, the proposed approaches save 14.7% and 18.0% of the hardware cost for the Propagate Partial SAD and SAD Tree architectures, respectively. With TSMC 0.18 μm 1P6M CMOS technology, the proposed Propagate Partial SAD architecture attains a 231.6 MHz operating frequency at a cost of 84.1k gates. Correspondingly, the execution speed of the optimized SAD Tree architecture is improved to 204.8 MHz at a hardware cost of 88.5k gates.
UR - http://www.scopus.com/inward/record.url?scp=62349134012&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=62349134012&partnerID=8YFLogxK
U2 - 10.1109/ICCD.2008.4751881
DO - 10.1109/ICCD.2008.4751881
M3 - Conference contribution
AN - SCOPUS:62349134012
SN - 9781424426584
T3 - 26th IEEE International Conference on Computer Design 2008, ICCD
SP - 328
EP - 333
BT - 26th IEEE International Conference on Computer Design 2008, ICCD
T2 - 26th IEEE International Conference on Computer Design 2008, ICCD
Y2 - 12 October 2008 through 15 October 2008
ER -