Open-vocabulary 3D Semantic Understanding via Affinity Neural Radiance Fields

Author：

Fan, Yujie [1] | Luo, Huan [2] (Scholars：罗欢)

Indexed by：

EI Scopus

Abstract：

Three-dimensional semantic understanding using only several multi-view images can largely reduce the communication burden on the network. In addition, while point clouds are extensively studied for 3D scene understanding, utilization of multi-view image data offers rich visual details and texture information. However, challenges persist in lifting 2D semantic features to 3D space and leveraging language for segmentation. Inspired by recent advancements, this paper proposes a method that combines CLIP features and SAM masks to create a feature field capable of segmenting objects via natural language text across 2D multi-view and 3D Gaussian splatting. It offers a promising function for extracting 3D assets for game engines and the metaverse. Our method involves mask generation from video frames, extracting physical scales via RGB NeRF with masks, and organizing hierarchical information for semantic comprehension. In the training process, affinity features maintain scale properties and guide CLIP feature generation with auto weights blending for semantic robustness. A straightforward 3D splatting CLIP feature approach and canonical text methodology enhance query robustness across 2D multi-view and 3D splatting through relevance score calculation based on text CLIP features for inference. Experimental results demonstrate promising improvements in semantic understanding of 3D scenes. © 2024 IEEE.
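
The abstract mentions a relevance score computed from text CLIP features together with canonical text, but does not give the formula. The sketch below is one plausible reading, not the authors' implementation: it assumes a pairwise-softmax score of the kind used in earlier language-field work, where a query embedding is compared against each canonical phrase embedding and the minimum is kept. The function name `relevance_scores`, the `temperature` parameter, and the toy 4-dimensional vectors standing in for real CLIP embeddings are all hypothetical.

```python
import numpy as np


def relevance_scores(features, query, canonicals, temperature=1.0):
    """Score each feature vector against a text query (assumed formulation).

    features:   (N, D) unit-norm per-point or per-pixel features
    query:      (D,)   unit-norm embedding of the text query
    canonicals: (K, D) unit-norm embeddings of generic canonical phrases
    Returns an (N,) array of scores in (0, 1).
    """
    sim_query = (features @ query)[:, None] / temperature   # (N, 1)
    sim_canon = (features @ canonicals.T) / temperature     # (N, K)
    # Two-way softmax of the query against each canonical phrase:
    # exp(sq) / (exp(sq) + exp(sc)) == 1 / (1 + exp(sc - sq)).
    pairwise = 1.0 / (1.0 + np.exp(sim_canon - sim_query))  # (N, K)
    # Keep the most pessimistic comparison for each feature.
    return pairwise.min(axis=1)


# Toy stand-ins for CLIP embeddings (hypothetical data, not real CLIP output).
query = np.array([1.0, 0.0, 0.0, 0.0])
canonicals = np.array([[0.0, 1.0, 0.0, 0.0],
                       [0.0, 0.0, 1.0, 0.0]])
features = np.array([[1.0, 0.0, 0.0, 0.0],   # aligned with the query
                     [0.0, 1.0, 0.0, 0.0]])  # aligned with a canonical phrase

scores = relevance_scores(features, query, canonicals)
mask = scores > 0.5  # features more similar to the query than to any canonical
```

Under this assumed formulation, a score above 0.5 means the feature is closer to the query than to every canonical phrase, so thresholding at 0.5 gives a simple segmentation mask.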

Keyword：

Image texture; Natural language processing systems; Query languages; Semantics; Semantic Segmentation

Affiliation：

[1] [Fan, Yujie] Computer and Data Science, Fuzhou University, Fuzhou, China
[2] [Fan, Yujie] Fujian Provincial Key Laboratory of Network Computing and Intelligent Information Processing, Fuzhou University, Fuzhou, China
[3] [Luo, Huan] Computer and Data Science, Fuzhou University, Fuzhou, China
[4] [Luo, Huan] Fujian Provincial Key Laboratory of Network Computing and Intelligent Information Processing, Fuzhou University, Fuzhou, China

Version：

Open-vocabulary 3D Semantic Understanding via Affinity Neural Radiance Fields
2024，2024 6th International Conference on Next Generation Data-Driven Networks, NGDN 2024

Related Keywords：

Cross-scale feature extraction module for efficient RGBD images semantic segmentation
2022，2021 International Conference on Computer Vision and Pattern Analysis, ICCPA 2021
Improving Water Hyacinth Extraction from UAV Images Using Enhanced U-Net
2025，7th International Conference on Wireless Communications, Networking and Applications, WCNA 2023
Denoising and Restoring of Infrared Image of Power Equipment Based on l2-relaxed l0 Sparse Analysis Priors
2022，Recent Advances in Electrical and Electronic Engineering
CN-LBP: Complex Networks-Based Local Binary Patterns for Texture Classification
2021，18th International Conference on Wavelet Analysis and Pattern Recognition, ICWAPR 2021

Year： 2024

Page： 464-467

Language： English

ESI Highly Cited Papers on the List： 0

30 Days PV： 0

Affiliated Colleges：

College of Computer and Data Science / College of Software (records not attributed to a specific unit within the college)
