Performance analysis of direct heuristic dynamic programming using control-theoretic measures

Lei Yang; Jennie Si; Konstantinos Tsakalis; Armando Rodriguez

doi:10.1109/IJCNN.2007.4371352

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Approximate dynamic programming (ADP) has been widely studied from several important perspectives: algorithm development, learning efficiency measured by success or failure statistics, convergence rate, and learning error bounds. Given that many learning benchmarks used in ADP or reinforcement learning studies are control problems, it is important and necessary to examine the learning controllers from a control-theoretic perspective. This paper uses direct heuristic dynamic programming (direct HDP) and several benchmark examples to introduce a unique analytical framework that can be extended to other learning control paradigms and other complex control problems. Sensitivity analysis and linear quadratic regulator (LQR) design are used in the paper for two purposes: to gauge direct HDP performance characteristics and to provide guidance toward designing better learning controllers. This gauge, however, does not limit direct HDP to being effective only as a linear controller. Toward this end, applications of direct HDP to nonlinear control problems beyond sensitivity analysis and the confines of LQR have been developed and compared with LQR design for command following and internal system parameter changes.

Original language	English (US)
Title of host publication	The 2007 International Joint Conference on Neural Networks, IJCNN 2007 Conference Proceedings
Pages	2504-2509
Number of pages	6
DOIs	https://doi.org/10.1109/IJCNN.2007.4371352
State	Published - 2007
Event	2007 International Joint Conference on Neural Networks, IJCNN 2007 - Orlando, FL, United States
Duration	Aug 12 2007 → Aug 17 2007

Publication series

Name	IEEE International Conference on Neural Networks - Conference Proceedings
ISSN (Print)	1098-7576

Other

Other	2007 International Joint Conference on Neural Networks, IJCNN 2007
Country/Territory	United States
City	Orlando, FL
Period	8/12/07 → 8/17/07

ASJC Scopus subject areas

Software

Cite this

Yang, L., Si, J., Tsakalis, K., & Rodriguez, A. (2007). Performance analysis of direct heuristic dynamic programming using control-theoretic measures. In The 2007 International Joint Conference on Neural Networks, IJCNN 2007 Conference Proceedings (pp. 2504-2509). Article 4371352 (IEEE International Conference on Neural Networks - Conference Proceedings). https://doi.org/10.1109/IJCNN.2007.4371352

Performance analysis of direct heuristic dynamic programming using control-theoretic measures. / Yang, Lei; Si, Jennie; Tsakalis, Konstantinos et al.
The 2007 International Joint Conference on Neural Networks, IJCNN 2007 Conference Proceedings. 2007. p. 2504-2509 4371352 (IEEE International Conference on Neural Networks - Conference Proceedings).

Yang, L, Si, J, Tsakalis, K & Rodriguez, A 2007, Performance analysis of direct heuristic dynamic programming using control-theoretic measures. in The 2007 International Joint Conference on Neural Networks, IJCNN 2007 Conference Proceedings, 4371352, IEEE International Conference on Neural Networks - Conference Proceedings, pp. 2504-2509, 2007 International Joint Conference on Neural Networks, IJCNN 2007, Orlando, FL, United States, 8/12/07. https://doi.org/10.1109/IJCNN.2007.4371352

@inproceedings{1a2421cf41d94e41ada49e38297b1944,
title = "Performance analysis of direct heuristic dynamic programming using control-theoretic measures",
abstract = "Approximate dynamic programming (ADP) has been widely studied from several important perspectives: algorithm development, learning efficiency measured by success or failure statistics, convergence rate, and learning error bounds. Given that many learning benchmarks used in ADP or reinforcement learning studies are control problems, it is important and necessary to examine the learning controllers from a control-theoretic perspective. This paper makes use of direct heuristic dynamic programming (direct HDP) and several benchmark examples to introduce a unique analytical framework that can be extended to other learning control paradigms and other complex control problems. The sensitivity analysis and the linear quadratic regulator (LQR) design are used in the paper for two purposes: to gauge direct HDP performance characteristics and to provide guidance toward designing better learning controllers. This gauge however does not limit the direct HDP to be effective only as a linear controller. Toward this end, applications of the direct HDP for nonlinear control problems beyond sensitivity analysis and the confines of LQR have been developed and compared with LQR design for command following and internal system parameter changes.",

author = "Lei Yang and Jennie Si and Konstantinos Tsakalis and Armando Rodriguez",
year = "2007",
doi = "10.1109/IJCNN.2007.4371352",
language = "English (US)",
isbn = "142441380X",
series = "IEEE International Conference on Neural Networks - Conference Proceedings",
pages = "2504--2509",
booktitle = "The 2007 International Joint Conference on Neural Networks, IJCNN 2007 Conference Proceedings",
note = "2007 International Joint Conference on Neural Networks, IJCNN 2007 ; Conference date: 12-08-2007 Through 17-08-2007",
}

TY - GEN
T1 - Performance analysis of direct heuristic dynamic programming using control-theoretic measures
AU - Yang, Lei
AU - Si, Jennie
AU - Tsakalis, Konstantinos
AU - Rodriguez, Armando
PY - 2007
Y1 - 2007
AB - Approximate dynamic programming (ADP) has been widely studied from several important perspectives: algorithm development, learning efficiency measured by success or failure statistics, convergence rate, and learning error bounds. Given that many learning benchmarks used in ADP or reinforcement learning studies are control problems, it is important and necessary to examine the learning controllers from a control-theoretic perspective. This paper makes use of direct heuristic dynamic programming (direct HDP) and several benchmark examples to introduce a unique analytical framework that can be extended to other learning control paradigms and other complex control problems. The sensitivity analysis and the linear quadratic regulator (LQR) design are used in the paper for two purposes: to gauge direct HDP performance characteristics and to provide guidance toward designing better learning controllers. This gauge however does not limit the direct HDP to be effective only as a linear controller. Toward this end, applications of the direct HDP for nonlinear control problems beyond sensitivity analysis and the confines of LQR have been developed and compared with LQR design for command following and internal system parameter changes.

UR - http://www.scopus.com/inward/record.url?scp=51749117375&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=51749117375&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2007.4371352
DO - 10.1109/IJCNN.2007.4371352
M3 - Conference contribution
AN - SCOPUS:51749117375
SN - 142441380X
SN - 9781424413805
T3 - IEEE International Conference on Neural Networks - Conference Proceedings
SP - 2504
EP - 2509
BT - The 2007 International Joint Conference on Neural Networks, IJCNN 2007 Conference Proceedings
T2 - 2007 International Joint Conference on Neural Networks, IJCNN 2007
Y2 - 12 August 2007 through 17 August 2007
ER -
