Author:

Lin, JunJie [1]

Indexed by:

EI; Scopus

Abstract:

As a deep residual network model, ResNet50 is of considerable practical importance in image classification, object recognition, and semantic image recognition. This paper uses an NVIDIA RTX 4090 GPU to carry out detailed performance testing and bottleneck analysis of ResNet50 training and inference, including the computation latency and data processing latency at different batch sizes. To verify the overall acceleration achievable for ResNet50, two optimizations are applied on top of GPU acceleration: first, mixed precision to improve GPU training and inference efficiency; second, DALI to optimize data preprocessing and reduce data loading latency. The experimental results show that at a batch size of 256, mixed precision yields an improvement of about 90% over FP32, but the overall performance gain is not significant. When mixed precision and DALI are used together to optimize both GPU computation and data loading, overall training and inference performance improve by 1.4 and 2.5 times, respectively. The results show that mixed precision alone cannot improve the overall computing efficiency of the system, and that the cost of data loading frequently limits end-to-end performance. Therefore, end users obtain a significant speedup only when GPU computation and data loading latency are optimized together. This work is relevant to evaluating and improving the acceleration performance of GPU-based deep neural networks. © 2024 SPIE.
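
The bottleneck argument in the abstract can be illustrated with a small Amdahl-style calculation. The sketch below is not taken from the paper: it assumes that data loading and GPU computation run back to back with no overlap, it reads "about 90%" as a 1.9x compute speedup, and the per-batch timings and the 3x loading speedup are hypothetical values chosen only to show the shape of the effect.

```python
def end_to_end_speedup(load_time, compute_time, compute_speedup, load_speedup=1.0):
    """Speedup of one step when loading and compute run serially (assumed, no overlap)."""
    before = load_time + compute_time
    after = load_time / load_speedup + compute_time / compute_speedup
    return before / after


# Hypothetical per-batch timings in milliseconds; NOT measurements from the paper.
load_ms, compute_ms = 300.0, 100.0

# Faster GPU compute only (assumed 1.9x), data loading unchanged.
only_compute = end_to_end_speedup(load_ms, compute_ms, compute_speedup=1.9)

# Faster compute plus a hypothetical 3x faster data loading pipeline.
compute_and_loading = end_to_end_speedup(
    load_ms, compute_ms, compute_speedup=1.9, load_speedup=3.0
)

print(f"faster compute only:        {only_compute:.2f}x")
print(f"faster compute and loading: {compute_and_loading:.2f}x")
```

Under these assumed numbers, speeding up computation alone barely changes the step time because loading dominates it, while speeding up both stages gives a much larger end-to-end gain.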

Keyword:

Batch data processing; Computer vision; Data handling; Deep neural networks; Efficiency; Graphics processing unit; Image classification; Semantics

Affiliation:

  • [ 1 ] [Lin, JunJie] Maynooth College of Engineering, Fuzhou University, Fuzhou, Fujian 350108, China

ISSN: 0277-786X

Year: 2024

Volume: 13184

Language: English
