TY - JOUR
T1 - Deep image compression based on multi-scale deformable convolution
AU - Li, Daowen
AU - Li, Yingming
AU - Sun, Heming
AU - Yu, Lu
N1 - Funding Information:
This work was supported by the National Natural Science Foundation of China under Grants 62071427 and U21B2004, and by the Key Research and Development Program of Zhejiang Province, China under Grant 2021C01119.
Publisher Copyright:
© 2022
PY - 2022/8
Y1 - 2022/8
AB - The efficiency of deep image compression has improved in recent years. However, to fully exploit context information when compressing image objects of different scales and shapes, the geometric structure of the inputs should be modeled more adaptively. In this paper, we introduce deformable convolution and its spatial attention extension into the deep image compression task to fully exploit this context information. Specifically, a novel deep image compression network with Multi-Scale Deformable Convolution and Spatial Attention, named MS-DCSA, is proposed to better extract compact and efficient latent representations and to reconstruct higher-quality images. First, a multi-scale deformable convolution is presented that provides multi-scale receptive fields for learning the spatial sampling offsets in deformable operations. Subsequently, a multi-scale deformable spatial attention module is developed to generate attention masks that re-weight extracted features according to their importance. In addition, the multi-scale deformable convolution is applied in carefully designed up/down sampling modules. Extensive experiments demonstrate that the proposed MS-DCSA network achieves improved performance on both the PSNR and MS-SSIM quality metrics compared with conventional as well as competing deep image compression methods.
KW - Deep image compression
KW - Multi-scale deformable convolution
KW - Spatial attention
UR - http://www.scopus.com/inward/record.url?scp=85133947292&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85133947292&partnerID=8YFLogxK
U2 - 10.1016/j.jvcir.2022.103573
DO - 10.1016/j.jvcir.2022.103573
M3 - Article
AN - SCOPUS:85133947292
SN - 1047-3203
VL - 87
JO - Journal of Visual Communication and Image Representation
JF - Journal of Visual Communication and Image Representation
M1 - 103573
ER -