VoxTNT: A Multi-Scale Transformer-based Approach for 3D Object Detection in Point Clouds

Authors：

Zheng, Qiangwen [1] | Wu, Sheng [2] | Wei, Jinghui [3]

Abstract：

[Background] Traditional methods, due to their static receptive field design, struggle to adapt to the significant scale differences among cars, pedestrians, and cyclists in urban autonomous driving scenarios. Moreover, cross-scale feature fusion often leads to hierarchical interference.

[Methodology] To address the key challenge of cross-scale representation consistency in 3D object detection for multi-class, multi-scale objects in autonomous driving scenarios, this study proposes a novel method named VoxTNT. VoxTNT leverages an equalized receptive field and a local-global collaborative attention mechanism to enhance detection performance. At the local level, a PointSetFormer module is introduced, incorporating an Induced Set Attention Block (ISAB) to aggregate fine-grained geometric features from high-density point clouds through reduced cross-attention. This design overcomes the information loss typically associated with traditional voxel mean pooling. At the global level, a VoxelFormerFFN module is designed, which abstracts non-empty voxels into a super-point set and applies cross-voxel ISAB interactions to capture long-range contextual dependencies. This approach reduces the computational complexity of global feature learning from O(N^2) to O(M^2) (where M ...

© 2025 Science Press. All rights reserved.
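
The two-step attention pattern of an Induced Set Attention Block, as named in the abstract, can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the VoxTNT implementation: it uses single-head attention, omits the layer normalization and feed-forward sublayers that a full block would normally carry, and the names `attention`, `isab`, `points`, `inducing` and the sizes `N`, `M`, `d` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    # Scaled dot-product attention: (Q, d) x (K, d) -> (Q, K) weights -> (Q, d).
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ values

def isab(points, inducing):
    # Step 1: the M inducing points attend to the N inputs,
    # producing an (M, d) summary of the whole set.
    summary = inducing + attention(inducing, points, points)
    # Step 2: the N inputs attend back to the M-row summary,
    # so no (N, N) attention matrix is ever formed.
    return points + attention(points, summary, summary)

# Illustrative sizes only (assumed, not from the paper).
rng = np.random.default_rng(0)
N, M, d = 1000, 16, 32
points = rng.normal(size=(N, d))    # e.g. per-point or per-voxel features
inducing = rng.normal(size=(M, d))  # would be learned parameters in practice
out = isab(points, inducing)
```

In this sketch the attention weight matrices have shapes (M, N) and (N, M), so with M much smaller than N the cost stays well below that of full self-attention over all N inputs.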

Keywords：

3D modeling; Automobile drivers; Autonomous vehicles; Object detection; Object recognition; Semantics; Stages; Three dimensional computer graphics

Affiliations：

[1] Zheng, Qiangwen: The College of Computer and Data Science, Fuzhou University, Fuzhou 350100, China
[2] Wu, Sheng: The Academy of Digital China (Fujian), Fuzhou University, Fuzhou 350100, China
[3] Wei, Jinghui: The College of Computer and Data Science, Fuzhou University, Fuzhou 350100, China

Source ：

Journal of Geo-Information Science

ISSN： 1560-8999

Year： 2025

Issue： 6

Volume： 27

Page： 1361-1380


ESI Highly Cited Papers on the List： 0

