TY - GEN
T1 - Error bound analysis of policy iteration based approximate dynamic programming for deterministic discrete-time nonlinear systems
AU - Guo, Wentao
AU - Liu, Feng
AU - Si, Jennie
AU - Mei, Shengwei
AU - Li, Rui
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2015/9/28
Y1 - 2015/9/28
N2 - Numerous approximate dynamic programming (ADP) algorithms have been developed based on policy iteration. For policy iteration based ADP of deterministic discrete-time nonlinear systems, the existing literature has proved convergence in the undiscounted value function formulation under the assumption of exact approximation. The error bound of policy iteration based ADP has also been analyzed in a discounted value function formulation that accounts for approximation errors. However, no error bound analysis exists for policy iteration based ADP in the undiscounted value function formulation with approximation errors taken into account. In this paper, we fill this theoretical gap. We provide a sufficient condition on the approximation error so that the iterative value function remains bounded in a neighbourhood of the optimal value function. To the best of the authors' knowledge, this is the first error bound result for undiscounted policy iteration for deterministic discrete-time nonlinear systems that considers approximation errors.
AB - Numerous approximate dynamic programming (ADP) algorithms have been developed based on policy iteration. For policy iteration based ADP of deterministic discrete-time nonlinear systems, the existing literature has proved convergence in the undiscounted value function formulation under the assumption of exact approximation. The error bound of policy iteration based ADP has also been analyzed in a discounted value function formulation that accounts for approximation errors. However, no error bound analysis exists for policy iteration based ADP in the undiscounted value function formulation with approximation errors taken into account. In this paper, we fill this theoretical gap. We provide a sufficient condition on the approximation error so that the iterative value function remains bounded in a neighbourhood of the optimal value function. To the best of the authors' knowledge, this is the first error bound result for undiscounted policy iteration for deterministic discrete-time nonlinear systems that considers approximation errors.
KW - Approximation algorithms
KW - Approximation methods
KW - Mathematical model
UR - http://www.scopus.com/inward/record.url?scp=84951023469&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84951023469&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2015.7280783
DO - 10.1109/IJCNN.2015.7280783
M3 - Conference contribution
AN - SCOPUS:84951023469
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2015 International Joint Conference on Neural Networks, IJCNN 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - International Joint Conference on Neural Networks, IJCNN 2015
Y2 - 12 July 2015 through 17 July 2015
ER -