Statistical Reinforcement Learning: Modern Machine Learning Approaches, Sugiyama M., 2015

   This series reflects the latest advances and applications in machine learning and pattern recognition through the publication of a broad range of reference works, textbooks, and handbooks. The inclusion of concrete examples, applications, and methods is highly encouraged. The scope of the series includes, but is not limited to, titles in the areas of machine learning, pattern recognition, computational intelligence, robotics, computational/statistical learning theory, natural language processing, computer vision, game AI, game theory, neural networks, computational neuroscience, and other relevant topics, such as machine learning applied to bioinformatics or cognitive science, which might be proposed by potential contributors.

Model-Based Reinforcement Learning.
In the model-free approaches above, policies are learned without explicitly modeling the unknown environment, i.e., the transition probability p(s'|s,a) of the agent in the environment. In contrast, the model-based approach explicitly learns the environment in advance and uses the learned environment model for policy learning.
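
To make the distinction concrete, the following is a minimal sketch (not the estimator used in this book) of learning a transition model p(s'|s,a) from logged transitions: the next state is modeled as a Gaussian whose mean is a linear function of the state-action pair, fitted by least squares. The function names and the Gaussian assumption are illustrative only.

import numpy as np

# Toy Gaussian transition model p(s'|s, a): the mean of the next state is a
# linear function of (s, a), fitted by least squares from logged transitions.
# This is an illustrative stand-in, not the non-parametric estimator of Chapter 10.

def fit_transition_model(S, A, S_next):
    # S: (n, dim_s) states, A: (n, dim_a) actions, S_next: (n, dim_s) next states
    X = np.hstack([S, A, np.ones((len(S), 1))])      # append a bias column
    W, *_ = np.linalg.lstsq(X, S_next, rcond=None)   # shape (dim_s + dim_a + 1, dim_s)
    sigma = (S_next - X @ W).std(axis=0) + 1e-6      # per-dimension noise scale
    return W, sigma

def sample_next_state(W, sigma, s, a, rng):
    # Draw s' from the fitted Gaussian model given the current state and action;
    # rng is a numpy random Generator, e.g., np.random.default_rng().
    x = np.concatenate([s, a, [1.0]])
    return x @ W + sigma * rng.standard_normal(len(sigma))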

Generating artificial samples from the learned environment model incurs no additional sampling cost, so the model-based approach is particularly useful when data collection is expensive (e.g., in robot control). However, accurately estimating the transition model from a limited amount of trajectory data in multi-dimensional continuous state and action spaces is highly challenging. Part IV of this book focuses on model-based reinforcement learning. Chapter 10 introduces a non-parametric transition model estimator that achieves the optimal convergence rate while remaining computationally efficient. Even with the optimal convergence rate, however, estimating the transition model in high-dimensional state and action spaces is still challenging. Chapter 11 introduces a dimensionality reduction method that can be efficiently embedded into the transition model estimation procedure and demonstrates its usefulness through experiments.
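
As an illustration of the "no additional sampling cost" point, the sketch below reuses the toy Gaussian model fitted above to roll out artificial trajectories and estimate the return of a candidate policy entirely inside the learned model; policy, reward_fn, horizon, and gamma are hypothetical placeholders, not objects defined in the book.

# Artificial rollouts in the learned model: once W and sigma are fitted,
# Monte Carlo returns of a candidate policy can be estimated without
# touching the real environment.

def simulate_return(W, sigma, policy, reward_fn, s0, horizon, gamma, rng):
    s, ret, discount = s0, 0.0, 1.0
    for _ in range(horizon):
        a = policy(s)                                  # action from the candidate policy
        s_next = sample_next_state(W, sigma, s, a, rng)
        ret += discount * reward_fn(s, a, s_next)      # discounted immediate reward
        discount *= gamma
        s = s_next
    return ret

# Averaging simulate_return over many start states (or many rollouts per start
# state) gives a model-based estimate of the expected return of the policy.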

Contents.
Foreword.
Preface.
Author.
I Introduction.
1 Introduction to Reinforcement Learning.
1.1 Reinforcement Learning.
1.2 Mathematical Formulation.
1.3 Structure of the Book.
1.3.1 Model-Free Policy Iteration.
1.3.2 Model-Free Policy Search.
1.3.3 Model-Based Reinforcement Learning.
II Model-Free Policy Iteration.
2 Policy Iteration with Value Function Approximation.
2.1 Value Functions.
2.1.1 State Value Functions.
2.1.2 State-Action Value Functions.
2.2 Least-Squares Policy Iteration.
2.2.1 Immediate-Reward Regression.
2.2.2 Algorithm.
2.2.3 Regularization.
2.2.4 Model Selection.
2.3 Remarks.
3 Basis Design for Value Function Approximation.
3.1 Gaussian Kernels on Graphs.
3.1.1 MDP-Induced Graph.
3.1.2 Ordinary Gaussian Kernels.
3.1.3 Geodesic Gaussian Kernels.
3.1.4 Extension to Continuous State Spaces.
3.2 Illustration.
3.2.1 Setup.
3.2.2 Geodesic Gaussian Kernels.
3.2.3 Ordinary Gaussian Kernels.
3.2.4 Graph-Laplacian Eigenbases.
3.2.5 Diffusion Wavelets.
3.3 Numerical Examples.
3.3.1 Robot-Arm Control.
3.3.2 Robot-Agent Navigation.
3.4 Remarks.
4 Sample Reuse in Policy Iteration.
4.1 Formulation.
4.2 Off-Policy Value Function Approximation.
4.2.1 Episodic Importance Weighting.
4.2.2 Per-Decision Importance Weighting.
4.2.3 Adaptive Per-Decision Importance Weighting.
4.2.4 Illustration.
4.3 Automatic Selection of Flattening Parameter.
4.3.1 Importance-Weighted Cross-Validation.
4.3.2 Illustration.
4.4 Sample-Reuse Policy Iteration.
4.4.1 Algorithm.
4.4.2 Illustration.
4.5 Numerical Examples.
4.5.1 Inverted Pendulum.
4.5.2 Mountain Car.
4.6 Remarks.
5 Active Learning in Policy Iteration.
5.1 Efficient Exploration with Active Learning.
5.1.1 Problem Setup.
5.1.2 Decomposition of Generalization Error.
5.1.3 Estimation of Generalization Error.
5.1.4 Designing Sampling Policies.
5.1.5 Illustration.
5.2 Active Policy Iteration.
5.2.1 Sample-Reuse Policy Iteration with Active Learning.
5.2.2 Illustration.
5.3 Numerical Examples.
5.4 Remarks.
6 Robust Policy Iteration.
6.1 Robustness and Reliability in Policy Iteration.
6.1.1 Robustness.
6.1.2 Reliability.
6.2 Least Absolute Policy Iteration.
6.2.1 Algorithm.
6.2.2 Illustration.
6.2.3 Properties.
6.3 Numerical Examples.
6.4 Possible Extensions.
6.4.1 Huber Loss.
6.4.2 Pinball Loss.
6.4.3 Deadzone-Linear Loss.
6.4.4 Chebyshev Approximation.
6.4.5 Conditional Value-At-Risk.
6.5 Remarks.
III Model-Free Policy Search.
7 Direct Policy Search by Gradient Ascent.
7.1 Formulation.
7.2 Gradient Approach.
7.2.1 Gradient Ascent.
7.2.2 Baseline Subtraction for Variance Reduction.
7.2.3 Variance Analysis of Gradient Estimators.
7.3 Natural Gradient Approach.
7.3.1 Natural Gradient Ascent.
7.3.2 Illustration.
7.4 Application in Computer Graphics: Artist Agent.
7.4.1 Sumie Painting.
7.4.2 Design of States, Actions, and Immediate Rewards.
7.4.3 Experimental Results.
7.5 Remarks.
8 Direct Policy Search by Expectation-Maximization.
8.1 Expectation-Maximization Approach.
8.2 Sample Reuse.
8.2.1 Episodic Importance Weighting.
8.2.2 Per-Decision Importance Weight.
8.2.3 Adaptive Per-Decision Importance Weighting.
8.2.4 Automatic Selection of Flattening Parameter.
8.2.5 Reward-Weighted Regression with Sample Reuse.
8.3 Numerical Examples.
8.4 Remarks.
9 Policy-Prior Search.
9.1 Formulation.
9.2 Policy Gradients with Parameter-Based Exploration.
9.2.1 Policy-Prior Gradient Ascent.
9.2.2 Baseline Subtraction for Variance Reduction.
9.2.3 Variance Analysis of Gradient Estimators.
9.2.4 Numerical Examples.
9.3 Sample Reuse in Policy-Prior Search.
9.3.1 Importance Weighting.
9.3.2 Variance Reduction by Baseline Subtraction.
9.3.3 Numerical Examples.
9.4 Remarks.
IV Model-Based Reinforcement Learning.
10 Transition Model Estimation.
10.1 Conditional Density Estimation.
10.1.1 Regression-Based Approach.
10.1.2 ε-Neighbor Kernel Density Estimation.
10.1.3 Least-Squares Conditional Density Estimation.
10.2 Model-Based Reinforcement Learning.
10.3 Numerical Examples.
10.3.1 Continuous Chain Walk.
10.3.2 Humanoid Robot Control.
10.4 Remarks.
11 Dimensionality Reduction for Transition Model Estimation.
11.1 Sufficient Dimensionality Reduction.
11.2 Squared-Loss Conditional Entropy.
11.2.1 Conditional Independence.
11.2.2 Dimensionality Reduction with SCE.
11.2.3 Relation to Squared-Loss Mutual Information.
11.3 Numerical Examples.
11.3.1 Artificial and Benchmark Datasets.
11.3.2 Humanoid Robot.
11.4 Remarks.
References.
Index.


