Fleer, Sascha: Scaffolding for learning from reinforcement: Improving interaction learning. 2020
Contents
- Abstract
- Acknowledgement
- Declaration
- Contents
- 1 Introduction
- I The toolbox
- 2 Reinforcement learning — a paradigm of human-inspired artificial intelligence
- 2.1 The basic description of a reinforcement learning problem
- 2.2 The Bellman equation
- 2.3 Optimal value functions
- 2.4 Q-learning using linear function approximators
- 2.5 Policy gradient methods
- 2.6 Going deeper with neural networks
- 2.7 Summary
- 3 The guiding principle of scaffolding
- 3.1 The concept of scaffolding in educational psychology
- 3.2 Teaching devices: employing computer-based tools for scaffolding the learning process of humans
- 3.3 Scaffolding artificial agents by organizing learning on a meta-level
- 3.3.1 Recruiting and maintaining the learner's attention
- 3.3.2 Simplifying the task
- 3.3.3 Modelling and demonstration
- 3.3.4 Ongoing diagnosis and assessment
- 3.3.5 Fading support and eventual transfer of responsibility
- 3.3.6 Summary
- 3.4 Reformulating scaffolding as a principle for guiding the learning process of machines
- 3.5 A research map for scaffolding in machine learning
- 3.6 Summary
- II Scaffolding: a universal approach for fostering the learning process
- 4 Scaffolding attention control by exploiting "perceptive acting"
- 4.1 The concept of entropy and mutual information in the context of reinforcement learning
- 4.2 Applying the concept to complex environments
- 4.2.1 Estimating the probability distribution of state transitions
- 4.2.2 Estimating the entropy & mutual information
- 4.3 Summary
- 5 Scaffolding attention control by exploiting "active visual perception"
- 6 Scaffolding the learning of efficient haptic exploration using "active haptic perception"
- 7 Scaffolding the agent's internal representation through skill transfer
- III Facilitating the learning process of interaction problems: testing the proposed scaffolding approaches
- 8 A learning domain for mediated interaction
- 8.1 The general design concept of the simulation world
- 8.2 Realization of a 2D simulation world with simplified physics
- 8.3 Perceiving & acting: defining a suitable state and action space for multi-object interaction scenarios
- 8.4 Designing suitable learning scenarios
- 8.5 Learning with a distance-related sensory input
- 8.6 Summary
- 9 A first scaffold for learning the "Extension-of-Reach Scenario": determining the best action set
- 10 A second scaffold for learning the "Extension-of-Reach Scenario": structuring the learning process
- 11 Scaffolding the learning process through "active visual perception": an attention-based approach
- 12 A scaffold for enabling "active haptic perception": learning efficient haptic exploration
- IV Conclusion
- 13 Summary, conclusion & outlook
- 13.1 Four scaffolding approaches — a summary
- 13.2 Conclusion
- 13.3 Recommendations for future research
- Bibliography
- Appendices
- A Pseudocode
- B Learning parameters used
- B.1 Linear Q-learning
- B.1.1 State representations
- B.1.2 Learning the "Extension-of-Reach Scenario" using different coordinate systems
- B.2 Deep Q-learning
- B.3 Recurrent attention advantage actor-critic model
- B.4 Haptic attention model
- C Floating Myrmex sensor: experimental results
- D Supplementary material
