TY - GEN
T1 - Efficient Computational Scheduling of Box and Gaussian FIR Filtering for CPU Microarchitecture
AU - Fukushima, Norishige
AU - Maeda, Yoshihiro
AU - Kawasaki, Yuki
AU - Nakamura, Masahiro
AU - Tsumura, Tomoaki
AU - Sugimoto, Kenjiro
AU - Kamata, Sei Ichiro
N1 - Funding Information: This work was supported by JSPS KAKENHI (JP17H01764, 18K18076). Publisher Copyright: © 2018 APSIPA organization.
PY - 2018/7/2
Y1 - 2018/7/2
AB - In this paper, we propose efficient computational scheduling of box and Gaussian filtering. These filters are fundamental tools used in various applications. The computational order of naïve implementations of these FIR filters is O(r^{2}), where r is the kernel radius. A separable implementation reduces the order to O(r) but requires two filtering passes. A recursive representation reduces the order further to O(1) but also needs two or more filtering passes. These efficient representations curtail the number of arithmetic operations; however, the cost of data I/O then dominates the computational time. We therefore optimize the computational scheduling of O(1) box and Gaussian filters to utilize cache memory effectively and reduce the time spent on data I/O. Experimental results show that the proposed scheduling achieves higher computational performance than the conventional implementation.
UR - http://www.scopus.com/inward/record.url?scp=85063550799&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85063550799&partnerID=8YFLogxK
U2 - 10.23919/APSIPA.2018.8659674
DO - 10.23919/APSIPA.2018.8659674
M3 - Conference contribution
AN - SCOPUS:85063550799
T3 - 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings
SP - 875
EP - 879
BT - 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 10th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018
Y2 - 12 November 2018 through 15 November 2018
ER -