Indexed by:
Abstract:
Camera-based stereo 3D object detection estimates the 3D properties of objects from binocular images alone, making it a cost-effective solution for autonomous driving. State-of-the-art methods mainly improve the detection accuracy of general objects by designing ingenious stereo matching algorithms or complex pipeline modules. Moreover, additional fine-grained annotations, such as masks or LiDAR point clouds, are often introduced to deal with occlusion, which incurs high manual annotation costs. To address the detection bottleneck caused by occlusion in a more cost-effective manner, we develop a novel stereo 3D object detection method named DSC3D, which achieves significant improvements for occluded objects without introducing additional supervision. Specifically, we first report the ambiguity in feature sampling, which refers to the presence of noisy features in the sampling for occluded objects. Then, we propose the Epipolar Constraint Deform-Attention (ECDA) module, which reweights epipolar features by adaptively aggregating local neighbor information, to address the unreliable left-right correspondence computation in stereo matching caused by occlusion. Furthermore, to ensure that 3D property estimation is based on robust object features, we propose a visible-region-guided constraint that explicitly guides the offset learning for feature sampling. Extensive experiments on the KITTI benchmark demonstrate that the proposed DSC3D outperforms state-of-the-art camera-based methods.
Keyword:
Reprint's Address:
Email:
Version:
Source:
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
ISSN: 1051-8215
Year: 2025
Issue: 3
Volume: 35
Page: 2794-2805
8.300 (JCR@2023)
Cited Count:
SCOPUS Cited Count:
ESI Highly Cited Papers on the List: 0
WanFang Cited Count:
Chinese Cited Count:
30 Days PV: 3
Affiliated Colleges: