VALUE FUNCTION ESTIMATION BASED ON AN ERROR GAUSSIAN MIXTURE MODEL

Indexed by：

SCIE

Abstract：

In reinforcement learning, balancing exploration and utilization (exploitation) in an agent's action selection has always been a key problem. Agents should not only make full use of the maximum-value action, but also explore potentially optimal actions. Inspired by this exploration and utilization of action selection, a novel value function exploration algorithm based on an error Gaussian mixture model (EGMM) is proposed in this paper. First, appropriate variables are chosen from the error data, and the number of Gaussian components is obtained by optimizing a Bayesian information criterion via the EGMM. Then, the EGMM is used to fit the error data and compute the conditional error mean, which compensates the output and thus yields more accurate results. We test the performance of the designed algorithm on a virtual experimental platform in a cloud computing environment. Experiments demonstrate that the proposed algorithm eliminates the influence of non-Gaussian noise on model prediction performance.
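The two steps named in the abstract (choosing the number of Gaussian components by a Bayesian information criterion, then using the conditional error mean to compensate the output) can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes scikit-learn's `GaussianMixture` as a stand-in for the EGMM, a scalar input `x`, and synthetic data with a made-up biased base predictor. The function names `fit_error_gmm` and `conditional_error_mean` are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def fit_error_gmm(x, err, max_components=4, seed=0):
    """Fit GMMs to joint (x, err) samples and keep the lowest-BIC model."""
    data = np.column_stack([x, err])
    models = [
        GaussianMixture(n_components=k, covariance_type="full",
                        random_state=seed).fit(data)
        for k in range(1, max_components + 1)
    ]
    return min(models, key=lambda m: m.bic(data))


def conditional_error_mean(gmm, x):
    """E[err | x] under a 2-D GMM over (x, err), for scalar inputs x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mu_x, mu_e = gmm.means_[:, 0], gmm.means_[:, 1]
    var_x = gmm.covariances_[:, 0, 0]
    cov_ex = gmm.covariances_[:, 1, 0]
    diff = x[:, None] - mu_x[None, :]  # shape (n_samples, n_components)
    # Log of weight_k * N(x | mu_x_k, var_x_k): responsibility given x alone.
    log_dens = (np.log(gmm.weights_) - 0.5 * np.log(2 * np.pi * var_x)
                - 0.5 * diff ** 2 / var_x)
    log_dens -= log_dens.max(axis=1, keepdims=True)  # numerical stability
    resp = np.exp(log_dens)
    resp /= resp.sum(axis=1, keepdims=True)
    # Per-component conditional mean of err given x, then mix by responsibility.
    comp_mean = mu_e + cov_ex / var_x * diff
    return (resp * comp_mean).sum(axis=1)


# Synthetic illustration (assumed data, not from the paper).
rng = np.random.default_rng(0)
n = 1000
x = rng.uniform(-3.0, 3.0, n)
true_value = np.sin(x)
bias = np.where(x < 0, 0.5, -0.5)      # input-dependent systematic error
base_pred = true_value - bias          # hypothetical imperfect estimator
target = true_value + 0.1 * rng.normal(size=n)
err = target - base_pred               # bimodal, hence non-Gaussian overall

gmm = fit_error_gmm(x, err)
corrected = base_pred + conditional_error_mean(gmm, x)

mse_base = np.mean((base_pred - true_value) ** 2)
mse_corrected = np.mean((corrected - true_value) ** 2)
print(gmm.n_components, mse_base, mse_corrected)
```

In this sketch the error is modeled jointly with the input, so the compensation term depends on where the query point lies. Which variables the paper actually pairs with the error data is not stated in the abstract.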

Keywords：

error Gaussian mixture model; Gaussian process regression; reinforcement learning; value function estimation

Affiliations：

[ 1 ] [Cui, Delong]Guangdong Univ Petrochem Technol, Coll Comp & Elect Informat, Maoming 525000, Guangdong, Peoples R China
[ 2 ] [Peng, Zhiping]Guangdong Univ Petrochem Technol, Coll Comp & Elect Informat, Maoming 525000, Guangdong, Peoples R China
[ 3 ] [Li, Qirui]Guangdong Univ Petrochem Technol, Coll Comp & Elect Informat, Maoming 525000, Guangdong, Peoples R China
[ 4 ] [He, Jieguang]Guangdong Univ Petrochem Technol, Coll Comp & Elect Informat, Maoming 525000, Guangdong, Peoples R China
[ 5 ] [Li, Kaibin]Guangdong Univ Petrochem Technol, Coll Comp & Elect Informat, Maoming 525000, Guangdong, Peoples R China
[ 6 ] [Hung, Shangchao]Fuzhou Univ, Fuzhou Polytech, Fuzhou 350108, Fujian, Peoples R China
[ 7 ] [Hung, Shangchao]Intelligent Technol Res Ctr, Fuzhou 350108, Fujian, Peoples R China

Reprint Address：

[Peng, Zhiping]Guangdong Univ Petrochem Technol, Coll Comp & Elect Informat, Maoming 525000, Guangdong, Peoples R China

Related Articles：

Voltage Control Method of Distribution Network with Soft Open Point Based on Deep Reinforcement Learning
2024，High Voltage Engineering
Combining Model-Based Q-Learning With Structural Knowledge Transfer for Robot Skill Learning
2019，IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS
An Accurate Cavitation Prediction Thruster Model based on Gaussian Process Regression
2017，IEEE International Conference on Robotics and Biomimetics (ROBIO)

Source ：

JOURNAL OF NONLINEAR AND CONVEX ANALYSIS

ISSN： 1345-4773

Year： 2021

Issue： 9

Volume： 22

Page： 1687-1702

1.016 (JCR@2021)

0.700 (JCR@2023)

ESI Discipline： MATHEMATICS;

ESI HC Threshold：36

JCR Journal Grade：2

CAS Journal Grade：3

Cited Count：

WoS CC Cited Count： 1

ESI Highly Cited Papers on the List： 0
