TY - GEN
T1 - P2Net
T2 - 2020 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2020
AU - Momma, Yutaka
AU - Wang, Weimin
AU - Simo-Serra, Edgar
AU - Iizuka, Satoshi
AU - Nakamura, Ryosuke
AU - Ishikawa, Hiroshi
N1 - Funding Information:
This paper is based on results obtained from a project commissioned by the New Energy and Industrial Technology Development Organization (NEDO). Hiroshi Ishikawa was partially supported by JSPS KAKENHI Grant number JP20H00615.
Publisher Copyright:
© 2020 IEEE.
PY - 2020/10/11
Y1 - 2020/10/11
N2 - We present a lightweight post-processing method to refine the semantic segmentation results of point cloud sequences. Most existing methods segment frame by frame and thus face the inherent ambiguity of the problem: from a measurement in a single frame, labels are sometimes difficult to predict even for humans. To remedy this, we propose explicitly training a network to refine the results predicted by an existing segmentation method. The network, which we call P2Net, learns consistency constraints between "coincident" points from consecutive frames after registration. We evaluate the proposed post-processing method both qualitatively and quantitatively on the SemanticKITTI dataset, which consists of real outdoor scenes. The effectiveness of the proposed method is validated by comparing the results predicted by two representative networks with and without refinement by the post-processing network. Qualitative visualization validates the key idea that labels of points that are difficult to predict can be corrected by P2Net. Quantitatively, overall mIoU improves from 10.5% to 11.7% for PointNet [1] and from 10.8% to 15.9% for PointNet++ [2].
KW - Point Cloud Sequences
KW - PointNet
KW - Semantic Segmentation
KW - Spatial Consistency
UR - http://www.scopus.com/inward/record.url?scp=85098868376&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85098868376&partnerID=8YFLogxK
U2 - 10.1109/SMC42975.2020.9283329
DO - 10.1109/SMC42975.2020.9283329
M3 - Conference contribution
AN - SCOPUS:85098868376
T3 - Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
SP - 4110
EP - 4115
BT - 2020 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2020
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 11 October 2020 through 14 October 2020
ER -