Contents
1 Introduction 1
1.1 Optimal Control 1
1.1.1 Continuous-Time LQR 1
1.1.2 Discrete-Time LQR 2
1.2 Adaptive Dynamic Programming 3
1.3 Review of Matrix Algebra 5
References 6
2 Neural-Network-Based Approach for Finite-Time Optimal Control 7
2.1 Introduction 7
2.2 Problem Formulation and Motivation 9
2.3 The Data-Based Identifier 9
2.4 Derivation of the Iterative ADP Algorithm with Convergence Analysis 11
2.5 Neural Network Implementation of the Iterative Control Algorithm 17
2.6 Simulation Study 18
2.7 Conclusion 20
References 22
3 Nearly Finite-Horizon Optimal Control for Nonaffine Time-Delay Nonlinear Systems 25
3.1 Introduction 25
3.2 Problem Statement 26
3.3 The Iteration ADP Algorithm and Its Convergence 30
3.3.1 The Novel ADP Iteration Algorithm 30
3.3.2 Convergence Analysis of the Improved Iteration Algorithm 33
3.3.3 Neural Network Implementation of the Iteration ADP Algorithm 38
3.4 Simulation Study 40
3.5 Conclusion 48
References 48
4 Multi-objective Optimal Control for Time-Delay Systems 49
4.1 Introduction 49
4.2 Problem Formulation 50
4.3 Derivation of the ADP Algorithm for Time-Delay Systems 51
4.4 Neural Network Implementation for the Multi-objective Optimal Control Problem of Time-Delay Systems 54
4.5 Simulation Study 55
4.6 Conclusion 61
References 62
5 Multiple Actor-Critic Optimal Control via ADP 63
5.1 Introduction 63
5.2 Problem Statement 65
5.3 SIANN Architecture-Based Classification 66
5.4 Optimal Control Based on ADP 69
5.4.1 Model Neural Network 70
5.4.2 Critic Network and Action Network 74
5.5 Simulation Study 82
5.6 Conclusion 91
References 91
6 Optimal Control for a Class of Complex-Valued Nonlinear Systems 95
6.1 Introduction 95
6.2 Motivations and Preliminaries 96
6.3 ADP-Based Optimal Control Design 99
6.3.1 Critic Network 99
6.3.2 Action Network 101
6.3.3 Design of the Compensation Controller 102
6.3.4 Stability Analysis 103
6.4 Simulation Study 107
6.5 Conclusion 110
References 110
7 Off-Policy Neuro-Optimal Control for Unknown Complex-Valued Nonlinear Systems 113
7.1 Introduction 113
7.2 Problem Statement 114
7.3 Off-Policy Optimal Control Method 115
7.3.1 Convergence Analysis of Off-Policy PI Algorithm 117
7.3.2 Implementation Method of Off-Policy Iteration Algorithm 119
7.3.3 Implementation Process 122
7.4 Simulation Study 122
7.5 Conclusion 125
References 125
8 Approximation-Error-ADP-Based Optimal Tracking Control for Chaotic Systems 127
8.1 Introduction 127
8.2 Problem Formulation and Preliminaries 128
8.3 Optimal Tracking Control Scheme Based on Approximation-Error ADP Algorithm 130
8.3.1 Description of Approximation-Error ADP Algorithm 130
8.3.2 Convergence Analysis of the Iterative ADP Algorithm 132
8.4 Simulation Study 136
8.5 Conclusion 144
References 144
9 Off-Policy Actor-Critic Structure for Optimal Control of Unknown Systems with Disturbances 147
9.1 Introduction 147
9.2 Problem Statement 148
9.3 Off-Policy Actor-Critic Integral Reinforcement Learning 151
9.3.1 On-Policy IRL for Nonzero Disturbance 151
9.3.2 Off-Policy IRL for Nonzero Disturbance 152
9.3.3 NN Approximation for Actor-Critic Structure 154
9.4 Disturbance Compensation Redesign and Stability Analysis 157
9.4.1 Disturbance Compensation Off-Policy Controller Design 157
9.4.2 Stability Analysis 158
9.5 Simulation Study 161
9.6 Conclusion 163
References 163
10 An Iterative ADP Method to Solve for a Class of Nonlinear Zero-Sum Differential Games 165
10.1 Introduction 165
10.2 Preliminaries and Assumptions 166
10.3 Iterative Approximate Dynamic Programming Method for ZS Differential Games 169
10.3.1 Derivation of the Iterative ADP Method 169
10.3.2 The Procedure of the Method 174
10.3.3 The Properties of the Iterative ADP Method 176
10.4 Neural Network Implementation 190
10.4.1 The Model Network 191
10.4.2 The Critic Network 192
10.4.3 The Action Network 193
10.5 Simulation Study 195
10.6 Conclusion 204
References 204
11 Neural-Network-Based Synchronous Iteration Learning Method for Multi-player Zero-Sum Games 207
11.1 Introduction 207
11.2 Motivations and Preliminaries 208
11.3 Synchronous Solution of Multi-player ZS Games 213
11.3.1 Derivation of Off-Policy Algorithm 213
11.3.2 Implementation Method for Off-Policy Algorithm 214
11.3.3 Stability Analysis 218
11.4 Simulation Study 219
11.5 Conclusion 224
References 224
12 Off-Policy Integral Reinforcement Learning Method for Multi-player Non-Zero-Sum Games 227
12.1 Introduction 227
12.2 Problem Statement 228
12.3 Multi-player Learning PI Solution for NZS Games 229
12.4 Off-Policy Integral Reinforcement Learning Method 234
12.4.1 Derivation of Off-Policy Algorithm 234
12.4.2 Implementation Method for Off-Policy Algorith