Direct heuristic dynamic programming based on an improved PID neural network

Jian Sun; Feng Liu; Jennie Si; Shengwei Mei

doi:10.1007/s11768-012-0112-0

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

In this paper, an improved PID-neural network (IPIDNN) structure is proposed and applied to the critic and action networks of direct heuristic dynamic programming (DHDP). As an online learning algorithm of approximate dynamic programming (ADP), DHDP has demonstrated its applicability to large state and control problems. Theoretically, the DHDP algorithm requires access to full state feedback in order to obtain solutions to the Bellman optimality equation. Unfortunately, it is not always possible to access all the states in a real system. This paper proposes a solution by suggesting an IPIDNN configuration to construct the critic and action networks to achieve output feedback control. Since this structure can estimate the integrals and derivatives of measurable outputs, more system states are utilized and thus better control performance is expected. Compared with traditional PIDNN, this configuration is flexible and easy to expand. Based on this structure, a gradient descent algorithm for this IPIDNN-based DHDP is presented. Convergence issues are addressed within a single learning time step and for the entire learning process. Some important insights are provided to guide the implementation of the algorithm. The proposed learning controller has been applied to a cart-pole system to validate the effectiveness of the structure and the algorithm.
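
The abstract notes that the structure estimates the integrals and derivatives of measurable outputs. The sketch below only illustrates that general idea and is not the authors' IPIDNN: it assumes a uniformly sampled scalar output, a rectangular-sum integral, and a backward-difference derivative. The function name `pid_features` and the parameter `dt` are hypothetical.

```python
def pid_features(outputs, dt):
    """Return (proportional, integral, derivative) triples for a sampled output.

    Illustrative assumption: `outputs` is a list of scalar measurements taken
    every `dt` seconds. This is not the network described in the paper.
    """
    features = []
    integral = 0.0
    # Use the first sample as its own predecessor so the first derivative is 0.
    previous = outputs[0] if outputs else 0.0
    for y in outputs:
        integral += y * dt                # rectangular-sum integral estimate
        derivative = (y - previous) / dt  # backward-difference derivative estimate
        features.append((y, integral, derivative))
        previous = y
    return features


if __name__ == "__main__":
    # Each triple could serve as an augmented input to a critic or action network.
    for triple in pid_features([1.0, 2.0, 4.0], dt=0.5):
        print(triple)
```

Features of this kind stand in for unmeasured states when only outputs are available; how the paper's network forms and weights them is not specified in this record.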

Original language	English (US)
Pages (from-to)	497-503
Number of pages	7
Journal	Journal of Control Theory and Applications
Volume	10
Issue number	4
DOIs	https://doi.org/10.1007/s11768-012-0112-0
State	Published - Nov 2012

Keywords

Approximate dynamic programming (ADP)
Direct heuristic dynamic programming (DHDP)
Improved PID neural network (IPIDNN)

ASJC Scopus subject areas

Control and Systems Engineering
Hardware and Architecture
Computer Science Applications

Cite this

@article{47dd3b605eac44b2aeaa909b28ab6b46,

title = "Direct heuristic dynamic programming based on an improved PID neural network",

abstract = "In this paper, an improved PID-neural network (IPIDNN) structure is proposed and applied to the critic and action networks of direct heuristic dynamic programming (DHDP). As an online learning algorithm of approximate dynamic programming (ADP), DHDP has demonstrated its applicability to large state and control problems. Theoretically, the DHDP algorithm requires access to full state feedback in order to obtain solutions to the Bellman optimality equation. Unfortunately, it is not always possible to access all the states in a real system. This paper proposes a solution by suggesting an IPIDNN configuration to construct the critic and action networks to achieve output feedback control. Since this structure can estimate the integrals and derivatives of measurable outputs, more system states are utilized and thus better control performance is expected. Compared with traditional PIDNN, this configuration is flexible and easy to expand. Based on this structure, a gradient descent algorithm for this IPIDNN-based DHDP is presented. Convergence issues are addressed within a single learning time step and for the entire learning process. Some important insights are provided to guide the implementation of the algorithm. The proposed learning controller has been applied to a cart-pole system to validate the effectiveness of the structure and the algorithm.",

keywords = "Approximate dynamic programming (ADP), Direct heuristic dynamic programming (DHDP), Improved PID neural network (IPIDNN)",

author = "Jian Sun and Feng Liu and Jennie Si and Shengwei Mei",

note = "Funding Information: Received 11 May 2010; revised 28 March 2011. This work was supported by the National Natural Science Foundation of China under Cooperative Research Funds (No. 50828701), and the third author is also supported by the U.S. Natural Science Foundation (No. ECCS-0702057). {\textcopyright} South China University of Technology and Academy of Mathematics and Systems Science, CAS and Springer-Verlag Berlin Heidelberg 2012",

year = "2012",

month = nov,

doi = "10.1007/s11768-012-0112-0",

language = "English (US)",

volume = "10",

pages = "497--503",

journal = "Journal of Control Theory and Applications",

issn = "1672-6340",

publisher = "Springer Science + Business Media",

number = "4",

}

TY - JOUR

T1 - Direct heuristic dynamic programming based on an improved PID neural network

AU - Sun, Jian

AU - Liu, Feng

AU - Si, Jennie

AU - Mei, Shengwei

N1 - Funding Information: Received 11 May 2010; revised 28 March 2011. This work was supported by the National Natural Science Foundation of China under Cooperative Research Funds (No. 50828701), and the third author is also supported by the U.S. Natural Science Foundation (No. ECCS-0702057). © South China University of Technology and Academy of Mathematics and Systems Science, CAS and Springer-Verlag Berlin Heidelberg 2012

PY - 2012/11

Y1 - 2012/11

N2 - In this paper, an improved PID-neural network (IPIDNN) structure is proposed and applied to the critic and action networks of direct heuristic dynamic programming (DHDP). As an online learning algorithm of approximate dynamic programming (ADP), DHDP has demonstrated its applicability to large state and control problems. Theoretically, the DHDP algorithm requires access to full state feedback in order to obtain solutions to the Bellman optimality equation. Unfortunately, it is not always possible to access all the states in a real system. This paper proposes a solution by suggesting an IPIDNN configuration to construct the critic and action networks to achieve output feedback control. Since this structure can estimate the integrals and derivatives of measurable outputs, more system states are utilized and thus better control performance is expected. Compared with traditional PIDNN, this configuration is flexible and easy to expand. Based on this structure, a gradient descent algorithm for this IPIDNN-based DHDP is presented. Convergence issues are addressed within a single learning time step and for the entire learning process. Some important insights are provided to guide the implementation of the algorithm. The proposed learning controller has been applied to a cart-pole system to validate the effectiveness of the structure and the algorithm.

AB - In this paper, an improved PID-neural network (IPIDNN) structure is proposed and applied to the critic and action networks of direct heuristic dynamic programming (DHDP). As an online learning algorithm of approximate dynamic programming (ADP), DHDP has demonstrated its applicability to large state and control problems. Theoretically, the DHDP algorithm requires access to full state feedback in order to obtain solutions to the Bellman optimality equation. Unfortunately, it is not always possible to access all the states in a real system. This paper proposes a solution by suggesting an IPIDNN configuration to construct the critic and action networks to achieve output feedback control. Since this structure can estimate the integrals and derivatives of measurable outputs, more system states are utilized and thus better control performance is expected. Compared with traditional PIDNN, this configuration is flexible and easy to expand. Based on this structure, a gradient descent algorithm for this IPIDNN-based DHDP is presented. Convergence issues are addressed within a single learning time step and for the entire learning process. Some important insights are provided to guide the implementation of the algorithm. The proposed learning controller has been applied to a cart-pole system to validate the effectiveness of the structure and the algorithm.

KW - Approximate dynamic programming (ADP)

KW - Direct heuristic dynamic programming (DHDP)

KW - Improved PID neural network (IPIDNN)

UR - http://www.scopus.com/inward/record.url?scp=84868325663&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84868325663&partnerID=8YFLogxK

U2 - 10.1007/s11768-012-0112-0

DO - 10.1007/s11768-012-0112-0

M3 - Article

AN - SCOPUS:84868325663

SN - 1672-6340

VL - 10

SP - 497

EP - 503

JO - Journal of Control Theory and Applications

JF - Journal of Control Theory and Applications

IS - 4

ER -
