https://github.com/YantianZha/SERLfD
Affordance-Aware Imitation Learning, Zha et al., IROS 2022
Coarse-to-Fine Imitation Learning, Edward Johns, ICRA 2021
One-Shot Imitation Learning, Yu et al., RSS 2018
Learning from Demonstrations (LfD)
✅ Demonstrations provide a robust learning signal, contributing to sample-efficient learning (a minimal sketch follows below)
❌ 1) Distribution-shift (covariate-shift) issues; 2) Learners cannot outperform their demonstrators; 3) High demonstration-collection costs
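To make the "robust learning signal" concrete, here is a minimal behavioral-cloning sketch in PyTorch: demonstrated actions are regressed directly from states as a supervised target. The network shape, dataset, and hyperparameters are illustrative assumptions, not the setup from any of the papers above.

```python
# Minimal behavioral cloning: regress demonstrated actions from states.
# All shapes and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 10, 4  # hypothetical dimensions

policy = nn.Sequential(
    nn.Linear(STATE_DIM, 64), nn.ReLU(),
    nn.Linear(64, ACTION_DIM),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Hypothetical demonstration dataset: (state, expert_action) pairs.
demo_states = torch.randn(256, STATE_DIM)
demo_actions = torch.randn(256, ACTION_DIM)

for epoch in range(100):
    pred = policy(demo_states)
    loss = nn.functional.mse_loss(pred, demo_actions)  # direct supervised signal
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Every demonstration state yields a gradient toward the expert's action, which is why LfD needs far fewer environment interactions than exploration-driven learning.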
Reinforcement Learning
✅ Learners can outperform their teachers; more robust to distribution shift; no demonstrations needed
❌ Learning is not sample-efficient (especially in sparse-reward environments)
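To see why sparse rewards hurt sample efficiency, consider this hedged sketch of tabular Q-learning on a hypothetical chain environment: the only reward sits at the far end, so most early episodes return no learning signal at all. The environment and hyperparameters are assumptions for illustration.

```python
# Hedged sketch: tabular Q-learning on a hypothetical sparse-reward
# chain. Reward is 0 everywhere except +1 at the terminal state, so
# most early episodes produce no informative updates.
import random

N_STATES, N_ACTIONS = 20, 2          # illustrative sizes
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.2

def step(state, action):
    """Action 1 moves right, action 0 moves left; reward only at the end."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

for episode in range(1000):
    state = 0
    for _ in range(200):                       # cap episode length
        if random.random() < EPSILON:
            action = random.randrange(N_ACTIONS)
        else:
            action = max(range(N_ACTIONS), key=lambda a: Q[state][a])
        nxt, reward, done = step(state, action)
        target = reward + (0.0 if done else GAMMA * max(Q[nxt]))
        Q[state][action] += ALPHA * (target - Q[state][action])
        state = nxt
        if done:
            break
```

Until the agent stumbles onto the terminal state by chance, every update has a zero target, which is exactly the inefficiency noted above.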
RL + LfD?
✅ Combine the benefits of RL and LfD, making RL more sample-efficient
❌ Still inefficient when handling ambiguity in demonstrations and environments
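One common way to combine the two, sketched here in the spirit of DQfD/DDPGfD rather than as the method of any specific paper above, is to seed the off-policy replay buffer with demonstration transitions so that early updates see rewarded data. The buffer layout and sampling scheme are illustrative assumptions.

```python
# Hedged sketch: seed an off-policy replay buffer with demonstration
# transitions so early RL updates see informative, rewarded data.
import random
from collections import deque

replay = deque(maxlen=100_000)

# Hypothetical demonstration transitions: (state, action, reward, next_state, done).
demo_transitions = [
    ((0.1, 0.2), 1, 0.0, (0.3, 0.2), False),
    ((0.3, 0.2), 0, 1.0, (0.5, 0.2), True),   # the demo reaches the sparse reward
]
replay.extend(demo_transitions)                # demonstrations go in first

def sample_batch(batch_size=32):
    """Uniform sampling; DQfD-style variants over-sample demo transitions."""
    return random.sample(list(replay), min(batch_size, len(replay)))

# During environment interaction, agent transitions are appended alongside:
# replay.append((state, action, reward, next_state, done))
```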
Humans instinctively self-explain their experiences, including their own problem solving, their mistakes, and the actions and outcomes of others
Self-Explanation: object-location > object-color (the location predicate better explains the demonstrated behavior)
Self-Explanation: object-location < object-color (the color predicate better explains the demonstrated behavior)
Environment              Predicates   Action Space
Robot-Push-Simple        6            Continuous
Robot-Push               10           Continuous
Robot-Remove-and-Push    20           Continuous
Pacman                   2            Discrete
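For reference, the benchmark summary above can be captured as a simple config structure; the field names are assumptions for illustration, not identifiers from the repository.

```python
# The benchmark table above as a simple config structure.
ENVIRONMENTS = {
    "Robot-Push-Simple":     {"num_predicates": 6,  "action_space": "continuous"},
    "Robot-Push":            {"num_predicates": 10, "action_space": "continuous"},
    "Robot-Remove-and-Push": {"num_predicates": 20, "action_space": "continuous"},
    "Pacman":                {"num_predicates": 2,  "action_space": "discrete"},
}
```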