
author:

Tang, Wanxing [1] | Cheng, Chuang [2] | Ai, Haiping [3] | Chen, Li [4]

Indexed by:

EI

Abstract:

In this article, trajectory planning for the two manipulators of a dual-arm robot approaching a patient in a complex environment is studied with deep reinforcement learning algorithms. The shapes of the human body and the bed are complex, which may lead to collisions between the human and the robot. Because the sparse reward the robot obtains from the environment may not be enough for it to accomplish the task, a neural network is trained with a proximal policy optimization (PPO) algorithm and a continuous reward function to control the manipulators as they prepare to hold the patient up. Firstly, considering the realistic scene, a 3D simulation environment is built to conduct the research. Secondly, inspired by the idea of the artificial potential field, a new reward and punishment function is proposed to help the robot obtain enough reward to explore the environment. The function consists of four parts: a reward guidance function, collision detection, an obstacle avoidance function, and a time function. The reward guidance function guides the robot toward the targets for holding the patient, the collision detection and obstacle avoidance functions complement each other to avoid obstacles, and the time function reduces the number of training episodes. Finally, after the robot is trained to reach the targets, the training results are analyzed. Compared with the DDPG algorithm, the PPO algorithm needs about 4 million fewer training steps to converge. Moreover, compared with other reward and punishment functions, the function used in this paper obtains many more rewards in the same training time, converges in much less time, and yields shorter episode lengths, so the advantage of the algorithm used in this paper is verified. © 2022 by the authors. Licensee MDPI, Basel, Switzerland.
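The four-part reward described in the abstract can be sketched as a single function of the end-effector state. This is a minimal illustration only: all constants, thresholds, and names below are assumptions for exposition, not values or code from the paper.

```python
import math

def dist(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def reward(ee_pos, target_pos, obstacle_pos, collided,
           k_guide=1.0, k_obs=0.5, d_safe=0.2,
           collision_penalty=-100.0, time_penalty=-0.1):
    """Composite reward of the kind the abstract describes:
    - guidance term: attraction toward the target (potential-field style),
    - collision detection: large fixed penalty on contact,
    - obstacle avoidance: repulsion inside a safety radius,
    - time term: small per-step penalty to shorten episodes.
    """
    if collided:
        # Collision-detection term: terminate with a large penalty.
        return collision_penalty
    # Guidance term: the closer to the target, the less negative.
    r_guide = -k_guide * dist(ee_pos, target_pos)
    # Obstacle-avoidance term: penalize proximity inside d_safe only.
    d_obs = dist(ee_pos, obstacle_pos)
    r_obs = -k_obs * (d_safe - d_obs) if d_obs < d_safe else 0.0
    return r_guide + r_obs + time_penalty
```

At the target and far from obstacles, only the small time penalty remains, which is what pushes the trained policy toward shorter episodes.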

Keyword:

Air navigation; Complex networks; Deep learning; Learning algorithms; Manipulators; Reinforcement learning; Robotic arms; Robot programming; Trajectories

Community:

  • [ 1 ] [Tang, Wanxing]School of Energy and Mechanical Engineering, Jiangxi University of Science and Technology, Nanchang; 330013, China
  • [ 2 ] [Tang, Wanxing]College of Mechanical Engineering, Fuzhou University, Fuzhou; 350002, China
  • [ 3 ] [Cheng, Chuang]College of Intelligence Science and Technology, National University of Defense Technology, Changsha; 410073, China
  • [ 4 ] [Ai, Haiping]School of Energy and Mechanical Engineering, Jiangxi University of Science and Technology, Nanchang; 330013, China
  • [ 5 ] [Chen, Li]College of Mechanical Engineering, Fuzhou University, Fuzhou; 350002, China

Reprint's Address:

Email:


Source :

Micromachines

Year: 2022

Issue: 4

Volume: 13

Impact Factor: 3.4 (JCR@2022), 3.000 (JCR@2023)

ESI HC Threshold:66

JCR Journal Grade:2

CAS Journal Grade:3

Cited Count:

WoS CC Cited Count:

SCOPUS Cited Count: 9

ESI Highly Cited Papers on the List: 0

WanFang Cited Count:

Chinese Cited Count:

30 Days PV: 2

Affiliated Colleges:

Address: FZU Library (No.2 Xuyuan Road, Fuzhou, Fujian, PRC, Post Code: 350116) Contact Us: 0591-22865326
Copyright: FZU Library. Technical Support: Beijing Aegean Software Co., Ltd. 闽ICP备05005463号-1