Citation: | WANG Yixian, HU Yufan, KONG Qingqun, ZENG Hui, ZHANG Lixin, FAN Bin. 3D point cloud semantic segmentation: state of the art and challenges[J]. Chinese Journal of Engineering, 2023, 45(10): 1653-1665. doi: 10.13374/j.issn2095-9389.2022.12.17.004 |
[1] |
Riemenschneider H, Bódis-Szomorú A, Weissenberg J, et al. Learning where to classify in multi-view semantic segmentation // European Conference on Computer Vision. Zurich, 2014: 516
|
[2] |
Armeni I, Sener O, Zamir A R, et al. 3D semantic parsing of large-scale indoor spaces // 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, 2016: 1534
|
[3] |
Wu W X, Qi Z A, Li F X. PointConv: deep convolutional networks on 3D point clouds // 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, 2019: 9613
|
[4] |
Wang Y, Sun Y B, Liu Z W, et al. Dynamic graph CNN for learning on point clouds. ACM Trans Graph, 2019, 38(5): 1
|
[5] |
Zhao H S, Jiang L, Jia J Y, et al. Point transformer // 2021 IEEE/CVF International Conference on Computer Vision (ICCV). Montreal, 2021: 16259
|
[6] |
Hinton G E, Salakhutdinov R R. Reducing the dimensionality of data with neural networks. Science, 2006, 313(5786): 504 doi: 10.1126/science.1127647
|
[7] |
Guo Y L, Wang H Y, Hu Q Y, et al. Deep learning for 3D point clouds: A survey. IEEE Trans Pattern Anal Mach Intell, 2020, 43(12): 4338
|
[8] |
Xie Y X, Tian J J, Zhu X X. Linking points with labels in 3D: A review of point cloud semantic segmentation. IEEE Geosci Remote Sens Mag, 2020, 8(4): 38 doi: 10.1109/MGRS.2019.2937630
|
[9] |
He Y, Yu H S, Liu X Y, et al. Deep learning based 3D segmentation: A survey [J/OL]. arXiv preprint (2021-3-10) [2022-12-17]. https://arxiv.org/abs/2103.05423
|
[10] |
Lahoud J, Cao J L, Khan F S, et al. 3D vision with transformers: A survey [J/OL]. arXiv preprint (2022-8-8) [2022-12-17]. https://arxiv.org/abs/2208.04309
|
[11] |
Lu D N, Xie Q, Wei M Q, et al. Transformers in 3D point clouds: A survey [J/OL]. arXiv preprint (2022-5-16) [2022-12-17]. https://arxiv.org/abs/2205.07417
|
[12] |
Zeng J H, Wang D C, Chen P. A survey on transformers for point cloud processing: An updated overview. IEEE Access, 2022, 10: 86510 doi: 10.1109/ACCESS.2022.3198999
|
[13] |
Gao B, Pan Y C, Li C K, et al. Are we hungry for 3D LiDAR data for semantic segmentation? A survey of datasets and methods. IEEE Trans Intell Transp Syst, 2021, 23(7): 6063
|
[14] |
Hackel T, Savinov N, Ladicky L, et al. Semantic3D.net: A new large-scale point cloud classification benchmark [J/OL]. arXiv preprint (2017-5-24) [2022-12-17]. https://arxiv.org/abs/1704.03847
|
[15] |
Behley J, Garbade M, Milioto A, et al. SemanticKITTI: A dataset for semantic scene understanding of LiDAR sequences // 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, 2019: 9296
|
[16] |
Geiger A, Lenz P, Urtasun R. Are we ready for autonomous driving? The KITTI vision benchmark suite // 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, 2012: 3354
|
[17] |
Dai A, Chang A X, Savva M, et al. ScanNet: richly-annotated 3D reconstructions of indoor scenes // 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, 2017: 5828
|
[18] |
Liang Z D, Yang M, Deng L Y, et al. Hierarchical depthwise graph convolutional neural network for 3D semantic segmentation of point clouds // 2019 International Conference on Robotics and Automation (ICRA). Montreal, 2019: 8152
|
[19] |
Li Y, Ma L F, Zhong Z L, et al. TGNet: Geometric graph CNN on 3-D point cloud segmentation. IEEE Trans Geosci Remote Sens, 2019, 58(5): 3588
|
[20] |
Choy C, Gwak J Y, Savarese S. 4D spatio-temporal ConvNets: Minkowski convolutional neural networks // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, 2019: 3075
|
[21] |
Wang L, Huang Y C, Hou Y L, et al. Graph attention convolution for point cloud semantic segmentation // 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, 2019: 10296
|
[22] |
Yang J C, Zhang Q, Ni B B, et al. Modeling point clouds with self-attention and Gumbel subset sampling // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, 2019: 3323
|
[23] |
Zhao H S, Jiang L, Fu C W, et al. PointWeb: enhancing local neighborhood features for point cloud processing // 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, 2019: 5565
|
[24] |
Komarichev A, Zhong Z C, Hua J. A-CNN: Annularly convolutional neural networks on point clouds // 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, 2019: 7421
|
[25] |
Meng H Y, Gao L, Lai Y K, et al. VV-net: Voxel VAE net with group convolutions for point cloud segmentation // 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, 2019: 8500
|
[26] |
Li G H, Müller M, Thabet A, et al. DeepGCNs: can GCNs go as deep as CNNs? // 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, 2019: 9267
|
[27] |
Thomas H, Qi C R, Deschaud J E, et al. KPConv: flexible and deformable convolution for point clouds // 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, 2019: 6411
|
[28] |
Milioto A, Vizzo I, Behley J, et al. RangeNet++: fast and accurate LiDAR semantic segmentation // 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Macau, 2019: 4213
|
[29] |
Ma Y N, Guo Y L, Liu H, et al. Global context reasoning for semantic segmentation of 3D point clouds // 2020 IEEE Winter Conference on Applications of Computer Vision (WACV). Snowmass, 2020: 2931
|
[30] |
Shi H Y, Lin G S, Wang H, et al. SpSequenceNet: semantic segmentation network on 4D point clouds // 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, 2020: 4574
|
[31] |
Hu Q Y, Yang B, Xie L H, et al. RandLA-Net: Efficient semantic segmentation of large-scale point clouds // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, 2020: 11108
|
[32] |
Lei H, Akhtar N, Mian A. SegGCN: efficient 3D point cloud segmentation with fuzzy spherical kernel // 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, 2020: 11611
|
[33] |
Xu C F, Wu B C, Wang Z N, et al. SqueezeSegV3: Spatially-adaptive convolution for efficient point-cloud segmentation // European Conference on Computer Vision. Glasgow, 2020: 1
|
[34] |
Xie Z Y, Chen J Z, Peng B. Point clouds learning with attention-based graph convolution networks. Neurocomputing, 2020, 402: 245 doi: 10.1016/j.neucom.2020.03.086
|
[35] |
Lei H, Akhtar N, Mian A. Spherical kernel for efficient graph convolution on 3D point clouds. IEEE Trans Pattern Anal Mach Intell, 2020, 43(10): 3664
|
[36] |
Wen X, Han Z Z, Youk G, et al. CF-SIS: Semantic-instance segmentation of 3D point clouds by context fusion with self-attention // Proceedings of the 28th ACM International Conference on Multimedia. Seattle, 2020: 1661
|
[37] |
Feng M T, Zhang L, Lin X F, et al. Point attention network for semantic segmentation of 3D point clouds. Pattern Recognit, 2020, 107: 107446 doi: 10.1016/j.patcog.2020.107446
|
[38] |
Zhang G G, Ma Q H, Jiao L C, et al. AttAN: Attention adversarial networks for 3D point cloud semantic segmentation // Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence. Yokohama, 2020: 789
|
[39] |
Huang H, Fang Y. Adaptive wavelet transformer network for 3D shape representation learning // International Conference on Learning Representations. Hefei, 2022: 1
|
[40] |
Xu M T, Ding R Y, Zhao H S, et al. PAConv: position adaptive convolution with dynamic kernel assembling on point clouds // 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, 2021: 3173
|
[41] |
Fan H H, Yang Y, Kankanhalli M. Point 4D transformer networks for spatio-temporal modeling in point cloud videos // 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, 2021: 14204
|
[42] |
Guo M H, Cai J X, Liu Z N, et al. PCT: Point cloud transformer. Comput Vis Media, 2021, 7(2): 187 doi: 10.1007/s41095-021-0229-5
|
[43] |
Zhang C, Wan H C, Shen X Y, et al. PVT: Point-voxel transformer for point cloud learning [J/OL]. arXiv preprint (2022-5-25) [2022-12-17]. https://arxiv.org/abs/2108.06076
|
[44] |
Wan J, Xie Z, Xu Y Y, et al. DGANet: A dilated graph attention-based network for local feature extraction on 3D point clouds. Remote Sens, 2021, 13(17): 3484 doi: 10.3390/rs13173484
|
[45] |
Wei Y M, Liu H, Xie T T, et al. Spatial-temporal transformer for 3D point cloud sequences // 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). Waikoloa, 2022: 1171
|
[46] |
Gao Y B, Liu X B, Li J, et al. LFT-net: Local feature transformer network for point clouds analysis. IEEE Trans Intell Transp Syst, 2023, 24(2): 2158
|
[47] |
Park C, Jeong Y, Cho M, et al. Fast point transformer // 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, 2022: 16949
|
[48] |
Lai X, Liu J H, Jiang L, et al. Stratified transformer for 3D point cloud segmentation // 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, 2022: 8500
|
[49] |
Xu S J, Wan R, Ye M S, et al. Sparse cross-scale attention network for efficient LiDAR panoptic segmentation // Proceedings of the AAAI Conference on Artificial Intelligence. Online, 2022: 2920
|
[50] |
Yu X M, Tang L L, Rao Y M, et al. Point-BERT: Pre-training 3D point cloud transformers with masked point modeling // 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, 2022: 19313
|
[51] |
Fu K X, Yuan M Z, Wang M N. Point-McBert: A multi-choice self-supervised framework for point cloud pre-training [J/OL]. arXiv preprint (2022-8-15) [2022-12-17]. https://arxiv.org/abs/2207.13226
|
[52] |
Zeng Z Y, Xu Y Y, Xie Z, et al. RG-GCN: A random graph based on graph convolution network for point cloud semantic segmentation. Remote Sens, 2022, 14(16): 4055 doi: 10.3390/rs14164055
|
[53] |
Wu Y X, Liao K L, Chen J T, et al. D-Former: A U-shaped dilated transformer for 3D medical image segmentation. Neural Comput Appl, 2022, 35: 1931
|
[54] |
Qian G C, Zhang X D, Hamdi A, et al. Improving standard transformer models for 3D point cloud understanding with image pretraining [J/OL]. arXiv preprint (2022-11-22) [2022-12-17]. https://arxiv.org/abs/2208.12259
|
[55] |
Yan X, Gao J T, Zheng C D, et al. 2DPASS: 2D priors assisted semantic segmentation on LiDAR point clouds // European Conference on Computer Vision. Tel Aviv, 2022: 677
|
[56] |
Wu X Y, Lao Y X, Jiang L, et al. Point transformer V2: Grouped vector attention and partition-based pooling [J/OL]. arXiv preprint (2022-10-11) [2022-12-17]. https://arxiv.org/abs/2210.05666
|
[57] |
Mousavian A, Pirsiavash H, Košecká J. Joint semantic segmentation and depth estimation with deep convolutional networks // 2016 Fourth International Conference on 3D Vision (3DV). Stanford, 2016: 611
|
[58] |
Qi C R, Su H, Mo K C, et al. PointNet: deep learning on point sets for 3D classification and segmentation // 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, 2017: 652
|
[59] |
Wu B C, Zhou X Y, Zhao S C, et al. SqueezeSegV2: Improved model structure and unsupervised domain adaptation for road-object segmentation from a LiDAR point cloud // 2019 International Conference on Robotics and Automation (ICRA). Montreal, 2019: 4376
|
[60] |
Wu B C, Wan A, Yue X Y, et al. SqueezeSeg: convolutional neural nets with recurrent CRF for real-time road-object segmentation from 3D LiDAR point cloud // 2018 IEEE International Conference on Robotics and Automation (ICRA). Brisbane, 2018: 1887
|
[61] |
Xu Q G, Sun X D, Wu C Y, et al. Grid-GCN for fast and scalable point cloud learning // 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, 2020: 5661
|
[62] |
Lei H, Akhtar N, Mian A. Octree guided CNN with spherical kernels for 3D point clouds // 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, 2019: 9631
|
[63] |
Liang Z D, Yang M, Li H, et al. 3D instance embedding learning with a structure-aware loss function for point cloud segmentation. IEEE Robotics Autom Lett, 2020, 5(3): 4915 doi: 10.1109/LRA.2020.3004802
|
[64] |
Qi C R, Yi L, Su H, et al. PointNet++: Deep hierarchical feature learning on point sets in a metric space // Advances in Neural Information Processing Systems. Long Beach, 2017: 1
|
[65] |
Liu J W, Liu J W, Luo X L. Research progress in attention mechanism in deep learning. Chin J Eng, 2021, 43(11): 1499
|
[66] |
Guo M H, Xu T X, Liu J J, et al. Attention mechanisms in computer vision: A survey. Comput Vis Media, 2022, 8(3): 331 doi: 10.1007/s41095-022-0271-y
|
[67] |
Thyagharajan A, Ummenhofer B, Laddha P, et al. Segment-fusion: Hierarchical context fusion for robust 3D semantic segmentation // 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, 2022: 1236
|
[68] |
Li R H, Li X Z, Heng P A, et al. PointAugment: an auto-augmentation framework for point cloud classification // 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, 2020: 6378
|
[69] |
Xiao A R, Huang J X, Guan D Y, et al. Unsupervised representation learning for point clouds: A survey [J/OL]. arXiv preprint (2022-6-5) [2022-12-17]. https://arxiv.org/abs/2202.13589
|
[70] |
Liu M H, Zhou Y, Qi C R, et al. LESS: Label-efficient semantic segmentation for LiDAR point clouds // European Conference on Computer Vision. Tel Aviv, 2022: 70
|
[71] |
Jhaldiyal A, Chaudhary N. Semantic segmentation of 3D LiDAR data using deep learning: A review of projection-based methods. Appl Intell, 2023, 53(6): 6844 doi: 10.1007/s10489-022-03930-5
|
[72] |
Guo M H, Lu C Z, Hou Q B, et al. SegNeXt: Rethinking convolutional attention design for semantic segmentation [J/OL]. arXiv preprint (2022-9-18) [2022-12-17]. https://arxiv.org/abs/2209.08575
|
[73] |
Qian G C, Li Y C, Peng H W, et al. PointNeXt: Revisiting PointNet++ with improved training and scaling strategies [J/OL]. arXiv preprint (2022-10-12) [2022-12-17]. https://arxiv.org/abs/2206.04670
|
[74] |
Xie X, Bai L, Huang X M. Real-time LiDAR point cloud semantic segmentation for autonomous driving. Electronics, 2021, 11(1): 11 doi: 10.3390/electronics11010011
|