Query:
Scholar name: 王仁平 (Wang Renping)
Abstract :
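This article proposes a new transfer technique: a design method for a multi-request DMAC (Direct Memory Access Controller) based on the AMBA (Advanced Microcontroller Bus Architecture) bus. The DMAC lets the CPU configure registers over the APB bus and moves memory data over the AHB bus. It supports cross-clock-domain transfers and a linked-list transfer mode, which improves the generality of the DMAC system. Multiple internal sets of request registers and a built-in weighted round-robin arbiter implement a multi-request round-robin transfer mode, which improves the flexibility of the DMA system when it faces multiple transfer requests. Compared with the normal transfer mode, this saves a large amount of transfer time and yields a 49% efficiency improvement. The design is verified on a UVM (Universal Verification Methodology) platform, and a general scoreboard design method is proposed to quickly locate data-comparison errors during DMA transfers, achieving 100% functional coverage.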
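The abstract does not give the arbiter's internals, so the following is only a minimal behavioral sketch of one common form of weighted round-robin arbitration. The class name, the `weights` parameter, and the "hold the grant for `weight` consecutive grants" policy are assumptions for illustration, not the authors' design.

```python
from typing import List, Optional


class WeightedRoundRobinArbiter:
    """Hypothetical model: each channel may hold the grant for up to
    `weight` consecutive grants before the arbiter rotates onward."""

    def __init__(self, weights: List[int]) -> None:
        if not weights or any(w <= 0 for w in weights):
            raise ValueError("weights must be a non-empty list of positive integers")
        self.weights = list(weights)
        self.current = 0                  # channel that currently owns the grant
        self.credit = self.weights[0]     # grants left before rotating

    def grant(self, requests: List[bool]) -> Optional[int]:
        """Return the index of the granted channel, or None if nobody requests."""
        n = len(self.weights)
        if len(requests) != n:
            raise ValueError("one request flag per channel is required")
        if not any(requests):
            return None
        # Rotate when the current owner is idle or has used up its credit.
        if not (requests[self.current] and self.credit > 0):
            for step in range(1, n + 1):
                candidate = (self.current + step) % n
                if requests[candidate]:
                    self.current = candidate
                    self.credit = self.weights[candidate]
                    break
        self.credit -= 1
        return self.current
```

With weights `[2, 1, 1]` and all three channels requesting continuously, this model grants channel 0 twice for every single grant to channels 1 and 2.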
Keyword :
AMBA; DMA; SoC; UVM; round-robin arbitration
Cite:
GB/T 7714 | 蔡挺 , 王仁平 , 卢朝辉 . 基于AMBA总线协议的多请求DMAC设计及UVM验证 [J]. | 电子制作 , 2024 , 32 (1) : 3-7 . |
MLA | 蔡挺 et al. "基于AMBA总线协议的多请求DMAC设计及UVM验证" . | 电子制作 32 . 1 (2024) : 3-7 . |
APA | 蔡挺 , 王仁平 , 卢朝辉 . 基于AMBA总线协议的多请求DMAC设计及UVM验证 . | 电子制作 , 2024 , 32 (1) , 3-7 . |
Abstract :
Graph convolution networks (GCN) have demonstrated success in learning graph structures; however, they are limited in inductive tasks. Graph attention networks (GAT) were proposed to address the limitations of GCN and have shown high performance in graph-based tasks. Despite this success, GAT faces challenges in hardware acceleration: 1) the GAT algorithm is difficult to adapt to hardware; 2) sparse matrix multiplication (SPMM) is hard to implement efficiently; and 3) irregular memory accesses cause complex addressing and pipeline stalls. To this end, this paper proposes SH-GAT, an FPGA-based GAT accelerator that achieves more efficient GAT inference. The proposed approach employs several optimizations to enhance GAT performance. First, the GAT algorithm is optimized using split weights and softmax approximation to make it more hardware-friendly. Second, a load-balanced SPMM kernel is designed to fully leverage potential parallelism and improve data throughput. Lastly, data preprocessing pre-fetches the source node and its neighbor nodes, which addresses the pipeline stalls and complex addressing that arise from irregular memory access. SH-GAT was evaluated on the Xilinx Alveo U280 FPGA accelerator card with three popular datasets. Compared to existing CPU, GPU, and state-of-the-art (SOTA) FPGA-based accelerators, SH-GAT achieves speedups of up to 3283x, 13x, and 2.3x, respectively.
Keyword :
accelerator; co-design; FPGA; graph; graph attention networks
Cite:
GB/T 7714 | Wang, Renping , Li, Shun , Tang, Enhao et al. SH-GAT: Software-hardware co-design for accelerating graph attention networks on FPGA [J]. | ELECTRONIC RESEARCH ARCHIVE , 2024 , 32 (4) : 2310-2322 . |
MLA | Wang, Renping et al. "SH-GAT: Software-hardware co-design for accelerating graph attention networks on FPGA" . | ELECTRONIC RESEARCH ARCHIVE 32 . 4 (2024) : 2310-2322 . |
APA | Wang, Renping , Li, Shun , Tang, Enhao , Lan, Sen , Liu, Yajing , Yang, Jing et al. SH-GAT: Software-hardware co-design for accelerating graph attention networks on FPGA . | ELECTRONIC RESEARCH ARCHIVE , 2024 , 32 (4) , 2310-2322 . |
Abstract :
The Internet of Things (IoT) is a crucial component of the contemporary information industry and represents a significant advancement in information technology aimed at enhancing both human productivity and daily life. Its applications are extensive and far-reaching. However, current research on the design of low-cost IoT SoC chips that leverage open-source instruction set architectures lacks depth and breadth. To meet the requirements of low-cost IoT system-on-chip (SoC) development, this study presents a commodity code recognition SoC chip based on the RISC-V instruction set architecture, which is capable of performing image acquisition and barcode recognition. The proposed system comprises two main components: a low-power RISC-V processor and an image recognition module. This study first improved the speed, accuracy, and area efficiency of the hardware design of a commodity barcode image-recognition module. The image recognition control module was then developed using the RISC-V processor and the OV7670 CMOS image sensor, and the recognition results were read out through interrupts. The processing speed for collecting and identifying $640\times 480$ images on the FPGA board was 11.4 FPS, and the image recognition rate was 99.5%. The chip was taped out in the UMC 55 nm process; it successfully decoded the barcodes and output the results at a working frequency of 40 MHz.
Keyword :
Barcode; MPW; RISC-V; SoC chip
Cite:
GB/T 7714 | Lin, Sijie , Wang, Renping , Cai, Ting et al. A Custom RISC-V Based SOC Chip for Commodity Barcode Identification [J]. | IEEE ACCESS , 2024 , 12 : 61708-61716 . |
MLA | Lin, Sijie et al. "A Custom RISC-V Based SOC Chip for Commodity Barcode Identification" . | IEEE ACCESS 12 (2024) : 61708-61716 . |
APA | Lin, Sijie , Wang, Renping , Cai, Ting , Zeng, Yunze . A Custom RISC-V Based SOC Chip for Commodity Barcode Identification . | IEEE ACCESS , 2024 , 12 , 61708-61716 . |
Abstract :
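In practical use of integrated circuits, unstable supply voltage and rising chip current density can easily cause excessive voltage drop, which degrades chip performance and reliability. To improve the performance and reliability of integrated circuits, the voltage drop in the circuit must be analyzed and optimized. This paper completes the physical implementation of a barcode module in SMIC's 55 nm process and analyzes and optimizes its voltage drop using the automated repair flow of mainstream industry EDA tools. The voltage drop was reduced by 21.37% in static voltage-drop analysis and by 27.79% in dynamic voltage-drop analysis, ultimately meeting sign-off requirements. Compared with the traditional manual method of fixing voltage drop, this method effectively reduces the routing resources consumed during the fix, thereby improving the effective utilization of the chip.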
Keyword :
power optimization techniques; dynamic voltage drop; voltage drop analysis; static voltage drop
Cite:
GB/T 7714 | 雷传煌 , 王仁平 , 卢朝辉 . 条形码模块的电压降分析与优化 [J]. | 电子制作 , 2024 , 32 (03) : 55-58,35 . |
MLA | 雷传煌 et al. "条形码模块的电压降分析与优化" . | 电子制作 32 . 03 (2024) : 55-58,35 . |
APA | 雷传煌 , 王仁平 , 卢朝辉 . 条形码模块的电压降分析与优化 . | 电子制作 , 2024 , 32 (03) , 55-58,35 . |
Abstract :
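Many applications today need graph data for representation and processing. Graph data is irregular data in non-Euclidean space, and graph convolutional networks (GCN) emerged to meet the demands of graph data processing. The main processing steps of a GCN are aggregation, transformation, and activation. In this paper, we use a heterogeneous scheme to accelerate GCN inference. Based on the characteristics of the data, the accelerator uses a systolic array in the transformation stage to improve the data flow. In the aggregation stage, the workload is divided into two types, which helps mitigate load imbalance during aggregation and shortens computation time to some extent. Finally, performance evaluation on a Xilinx Virtex UltraScale+ VU37P HBM FPGA platform shows average speedups of 389.19× over a CPU and 6.73× over a GPU.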
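The three GCN steps named in the abstract (aggregation, transformation, activation) can be sketched in plain Python as below. This is a software reference model only, not the accelerator's dataflow. The function names, the edge-list format with pre-normalized coefficients, the choice to transform before aggregating, and the use of ReLU are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]
# Hypothetical sparse format: destination node -> [(source node, coefficient)].
EdgeList = Dict[int, List[Tuple[int, float]]]


def transform(features: Matrix, weight: Matrix) -> Matrix:
    """Dense step X @ W (the part the abstract maps onto a systolic array)."""
    return [[sum(x * w for x, w in zip(row, col)) for col in zip(*weight)]
            for row in features]


def aggregate(edges: EdgeList, features: Matrix) -> Matrix:
    """Sparse step: each node sums its neighbors' rows, scaled by the edge
    coefficient (assumed to be normalized already)."""
    width = len(features[0])
    out = [[0.0] * width for _ in features]
    for dst, neighbors in edges.items():
        for src, coeff in neighbors:
            for k in range(width):
                out[dst][k] += coeff * features[src][k]
    return out


def relu(matrix: Matrix) -> Matrix:
    """Element-wise activation."""
    return [[max(0.0, v) for v in row] for row in matrix]


def gcn_layer(edges: EdgeList, features: Matrix, weight: Matrix) -> Matrix:
    """One layer: activation(aggregate(transform(X)))."""
    return relu(aggregate(edges, transform(features, weight)))
```

Because A(XW) equals (AX)W, the two stages can run in either order; transforming first is a common choice when the weight matrix reduces the feature width, since the sparse stage then touches fewer values.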
Keyword :
graph convolution; machine learning; hardware acceleration
Cite:
GB/T 7714 | 常静涛 , 王仁平 . 基于图卷积的神经网络硬件加速器设计 [J]. | 中国集成电路 , 2024 , 33 (Z1) : 24-29,50 . |
MLA | 常静涛 et al. "基于图卷积的神经网络硬件加速器设计" . | 中国集成电路 33 . Z1 (2024) : 24-29,50 . |
APA | 常静涛 , 王仁平 . 基于图卷积的神经网络硬件加速器设计 . | 中国集成电路 , 2024 , 33 (Z1) , 24-29,50 . |
Abstract :
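To address the heavy hardware resource consumption of the CLAHE algorithm, the algorithm is improved in two respects from the perspective of hardware implementation. For the clipping threshold, a general-purpose method for determining the threshold is proposed: a quality factor is constructed from information entropy and structural similarity, and the clipping threshold used in the hardware implementation is chosen to give the best quality factor. This balances contrast enhancement against distortion while avoiding hardware resources spent on extensive computation over the image data itself. For the redistribution of over-threshold pixels, an improved method is proposed: the excess pixels are divided evenly only among gray levels that did not exceed the threshold, and distribution stops if a gray level exceeds the threshold again. This reduces image distortion while avoiding the hardware overhead of repeated redistribution. Based on the improved CLAHE algorithm, an FPGA-based low-light image enhancement system is implemented. Experimental results show that, with the general-purpose clipping threshold, enhanced images generally obtain a higher quality factor and a better overall result; compared with the conventional method, the improved pixel redistribution raises structural similarity by 8.88% on average at the cost of an average 3.28% loss in information entropy; and the low-light image enhancement system supports image acquisition and processing at 640×480 @ 60 fps. This design offers a new reference for hardware implementation of image enhancement algorithms.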
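The redistribution rule in the abstract can be read in more than one way. The sketch below shows one possible reading: a single pass in which excess counts are shared equally among bins that were below the threshold, and any receiving bin is capped at the threshold instead of triggering another pass. The function name, the integer division, and the choice to drop the leftover counts are assumptions, not details taken from the paper.

```python
from typing import List


def clip_and_redistribute(hist: List[int], clip_limit: int) -> List[int]:
    """Clip a tile histogram and redistribute the excess in a single pass."""
    # Total count above the limit across all over-threshold bins.
    excess = sum(h - clip_limit for h in hist if h > clip_limit)
    clipped = [min(h, clip_limit) for h in hist]

    # Only bins strictly below the limit receive a share.
    receivers = [i for i, h in enumerate(hist) if h < clip_limit]
    if excess == 0 or not receivers:
        return clipped

    share = excess // len(receivers)  # equal share; remainder is dropped
    for i in receivers:
        # Cap at the limit: a bin that would overflow again simply stops
        # receiving, so no second redistribution pass is needed.
        clipped[i] = min(clipped[i] + share, clip_limit)
    return clipped
```

For example, `clip_and_redistribute([10, 2, 3, 9], 6)` clips 7 counts, gives 3 to each of the two low bins, and caps the second of them at 6.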
Keyword :
improved CLAHE algorithm; FPGA; pixel redistribution; image enhancement; clipping threshold
Cite:
GB/T 7714 | 林立芃 , 杨朝阳 , 伍明诚 et al. 改进型CLAHE图像增强算法及其FPGA实现 [J]. | 电子测量技术 , 2024 , 47 (10) : 126-133 . |
MLA | 林立芃 et al. "改进型CLAHE图像增强算法及其FPGA实现" . | 电子测量技术 47 . 10 (2024) : 126-133 . |
APA | 林立芃 , 杨朝阳 , 伍明诚 , 王仁平 , 阴亚东 . 改进型CLAHE图像增强算法及其FPGA实现 . | 电子测量技术 , 2024 , 47 (10) , 126-133 . |
Abstract :
A voltage reference is indispensable in integrated circuits. To improve the limited linear output voltage range and energy efficiency of voltage references, we propose a switched-capacitor-based programmable voltage reference scheme that employs inverter-based OTAs to reduce power consumption and uses a novel Correlated Level Shifting (CLS) technique (without active overhead) to enhance the OTA's DC gain and integral gain. Evaluated in SMIC 180 nm CMOS technology, a voltage reference based on this scheme achieves a programmable output voltage range from 266 to 995 mV over -30 to 120 °C, with a corresponding temperature coefficient (TC) of 82.4 to 99.5 ppm/°C. The power consumption is 976 nW. Comparative experiments and evaluations against other schemes support the advantages of the proposed scheme, namely its high energy efficiency and wide output voltage range. The scheme is suitable for deployment in a range of edge data processing systems.
Keyword :
correlated level shifting; switched-capacitor; voltage reference
Cite:
GB/T 7714 | Wei, Rongshan , Chen, Chu , Wei, Cong et al. An Energy-Efficient Inverter-Based Voltage Reference Scheme with Wide Output Range Using Correlated Level Shifting Technique [J]. | ELECTRONICS , 2023 , 12 (24) . |
MLA | Wei, Rongshan et al. "An Energy-Efficient Inverter-Based Voltage Reference Scheme with Wide Output Range Using Correlated Level Shifting Technique" . | ELECTRONICS 12 . 24 (2023) . |
APA | Wei, Rongshan , Chen, Chu , Wei, Cong , Wang, Renping , Huang, Lijie , Zhou, Qikun et al. An Energy-Efficient Inverter-Based Voltage Reference Scheme with Wide Output Range Using Correlated Level Shifting Technique . | ELECTRONICS , 2023 , 12 (24) . |
Abstract :
This paper presents a low-power capacitive-to-digital converter (CDC) based on an incremental delta-sigma modulator. It utilizes a zoom-in sensing capacitor that is insensitive to parasitic capacitance, improving the capacitance resolution. The use of a high-gain, PVT-robust current-starved OTA and a dynamic bias comparator enhances the efficiency of the system. An ultra-low-power bias circuit is integrated into the system, further improving integration and efficiency. The proposed CDC is fabricated in a 180 nm CMOS process. Operating at a 1.2 V supply voltage and a 250 kHz sampling frequency, with a measurement time of 0.8 ms, it achieves a capacitance resolution of 107.6 aF and a power consumption of 10.27 µW. The figure of merit (FoM) is 2.06 pJ/step.
Keyword :
Capacitive-to-digital; Current-starved OTA; Delta-sigma; Low power
Cite:
GB/T 7714 | Wei, Rongshan , Wei, Cong , Huang, Lijie et al. Low power capacitive-to-digital converter based on incremental delta-sigma modulator [J]. | MICROELECTRONICS JOURNAL , 2023 , 142 . |
MLA | Wei, Rongshan et al. "Low power capacitive-to-digital converter based on incremental delta-sigma modulator" . | MICROELECTRONICS JOURNAL 142 (2023) . |
APA | Wei, Rongshan , Wei, Cong , Huang, Lijie , Huang, Gongxing , Wang, Renping , Hu, Wei . Low power capacitive-to-digital converter based on incremental delta-sigma modulator . | MICROELECTRONICS JOURNAL , 2023 , 142 . |
Abstract :
This paper introduces a discrete-time delta-sigma ADC for Internet of Things (IoT) applications. It uses a second-order architecture with a 4-bit successive approximation register (SAR) quantizer, based on oversampling, to ensure a sufficiently high SQNR. Additionally, a dynamic weighted averaging (DWA) technique is employed to achieve good feedback CDAC linearity. System-level analysis and circuit implementation are described in detail. The prototype is manufactured in a 180 nm CMOS process. Operating at a supply voltage of 1.8 V and a sampling frequency of 2.5 MHz, the proposed ADC consumes a total power of 1.3 mW, including biasing circuitry. It achieves a DR of 102.6 dB, an SNR of 101.5 dB, and an SNDR of 98.6 dB within a 10 kHz bandwidth. The resulting Schreier figures of merit (FoM) for SNDR, SNR, and DR are 167.46 dB, 170.36 dB, and 171.46 dB, respectively.
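The figures of merit can be checked with the standard Schreier definition, FoM = metric (dB) + 10·log10(bandwidth / power). The helper below is a small sketch of that arithmetic; the function name is hypothetical, and the inputs are the values quoted in the abstract (10 kHz bandwidth, 1.3 mW power).

```python
import math


def schreier_fom(metric_db: float, bandwidth_hz: float, power_w: float) -> float:
    """Schreier figure of merit in dB for a given SNDR, SNR, or DR value."""
    return metric_db + 10.0 * math.log10(bandwidth_hz / power_w)


BANDWIDTH_HZ = 10e3   # 10 kHz signal bandwidth
POWER_W = 1.3e-3      # 1.3 mW total power

fom_sndr = schreier_fom(98.6, BANDWIDTH_HZ, POWER_W)
fom_snr = schreier_fom(101.5, BANDWIDTH_HZ, POWER_W)
fom_dr = schreier_fom(102.6, BANDWIDTH_HZ, POWER_W)
```

The bandwidth-to-power term is about 68.86 dB, so the results are roughly 167.46 dB for SNDR, 170.36 dB for SNR, and 171.46 dB for DR.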
Keyword :
Delta-sigma; Discrete-time; Internet of Things (IoT); SAR quantizer
Cite:
GB/T 7714 | Wei, Cong , Chen, Chengying , Huang, Gongxing et al. A 1.8 V 98.6 dB SNDR discrete-time CMOS delta-sigma ADC [J]. | MICROELECTRONICS JOURNAL , 2023 , 144 . |
MLA | Wei, Cong et al. "A 1.8 V 98.6 dB SNDR discrete-time CMOS delta-sigma ADC" . | MICROELECTRONICS JOURNAL 144 (2023) . |
APA | Wei, Cong , Chen, Chengying , Huang, Gongxing , Huang, Lijie , Wang, Renping , Wei, Rongshan . A 1.8 V 98.6 dB SNDR discrete-time CMOS delta-sigma ADC . | MICROELECTRONICS JOURNAL , 2023 , 144 . |
Abstract :
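As the integrated circuit industry continues to develop, chip power consumption has increasingly become an important factor affecting chip performance, and a well-designed power network is key to addressing it. Taking a fifth-generation reduced instruction set computing (RISC-V) processor chip in a 55 nm process as an example, this paper proposes an approach to power network design. It derives and designs the important parameters of the power network, including the structure, width, and spacing of the power rings and power mesh and the location and number of power IOs, and it accounts for common power network problems during design. After the design was completed, sign-off verification tools were used to check the power network for IR drop, electromigration, and related issues, confirming that the design method is sound.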
Keyword :
RISC-V processor; ohmic voltage drop (IR drop); power integrity analysis; power planning; power rail analysis; electromigration analysis
Cite:
GB/T 7714 | 李溢祺 , 王仁平 . RISC-V处理器芯片的电源网络设计 [J]. | 贵州大学学报(自然科学版) , 2022 , 39 (04) : 54-59 . |
MLA | 李溢祺 et al. "RISC-V处理器芯片的电源网络设计" . | 贵州大学学报(自然科学版) 39 . 04 (2022) : 54-59 . |
APA | 李溢祺 , 王仁平 . RISC-V处理器芯片的电源网络设计 . | 贵州大学学报(自然科学版) , 2022 , 39 (04) , 54-59 . |