Abstract:
Speaker recognition identifies individuals based on their unique voice characteristics. To address the challenges associated with data collection, we use deep learning techniques to introduce two lightweight speaker recognition models: Sinc-MN1D and AAM-Sinc-MN1D. Both models use a modified MobileNetV2 framework as the core module. To capture essential short-term speaker features effectively, we replace the initial convolutional layer of the backbone network with a modified convolutional layer inspired by the optimized SincNet. Furthermore, we incorporate the AAM-softmax loss function, commonly used in face recognition, to enhance the extraction of critical frequency features and the models' capability to identify challenging samples. We evaluated our method on the TIMIT dataset, where it outperformed the baseline approach.
Source:
2024 INTERNATIONAL CONFERENCE ON NETWORKING AND NETWORK APPLICATIONS, NANA 2024
Year: 2024
Page: 453-458