TY - JOUR
T1 - Accelerated Deformable Part Models on GPUs
AU - Hirabayashi, Manato
AU - Kato, Shinpei
AU - Edahiro, Masato
AU - Takeda, Kazuya
AU - Mita, Seiichi
PY - 2016/6/1
Y1 - 2016/6/1
AB - Object detection is a fundamental challenge facing intelligent applications. Image processing is a promising approach to this end, but its computational cost is often a significant problem. This paper presents schemes for accelerating deformable part models (DPM) on graphics processing units (GPUs). DPM is a well-known algorithm for image-based object detection, and it achieves high detection rates at the expense of computational cost. GPUs are massively parallel compute devices designed to accelerate data-parallel, compute-intensive workloads. According to an analysis of execution times, approximately 98 percent of DPM code exhibits loop processing, which means that DPM could be highly parallelized by GPUs. In this paper, we implement DPM on the GPU by exploiting multiple parallelization schemes. An experimental evaluation of this GPU-accelerated DPM implementation demonstrates that the best-performing scheme on an NVIDIA GPU achieves a speedup of 8.6x over a naive CPU-based implementation.
KW - Deformable Part Models (DPM)
KW - Graphics Processing Unit (GPU)
KW - Image Processing
UR - http://www.scopus.com/inward/record.url?scp=84969981513&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84969981513&partnerID=8YFLogxK
U2 - 10.1109/TPDS.2015.2453962
DO - 10.1109/TPDS.2015.2453962
M3 - Article
AN - SCOPUS:84969981513
SN - 1045-9219
VL - 27
SP - 1589
EP - 1602
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 6
M1 - 7152943
ER -