Train in simulation, deploy in the real world (with real-time adaptation)

Why simulators for robot learning?

  • Most RL-based algorithms are very sample-inefficient, so collecting enough data on a real robot is impractical
  • Simulators are cheap, fast, and scalable

Problems of Sim2Real

  • Non-parametric mismatches (the simulator doesn’t model some effects at all)
    • complex aerodynamics, fluid dynamics, tire dynamics, etc.
  • Parametric mismatches (the simulator uses different parameter values than the real system)
    • robot mass, friction, etc.

Domain Randomization

  • Randomize the physical parameters (mass, friction, etc.) in simulation
  • Train a single RL policy that works for many parameter values
    • Approximation of robust control
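
As a minimal sketch of the randomization step, the snippet below draws a fresh set of simulator parameters for every training episode. The SimParams class, the two ranges, and the uniform sampling are assumptions made for illustration; the notes do not specify which parameters are randomized or over what ranges.

```python
import random
from dataclasses import dataclass


@dataclass
class SimParams:
    """Hypothetical container for the randomized simulator parameters."""
    mass: float      # kg
    friction: float  # dimensionless friction coefficient


# Illustrative ranges only; real ranges depend on the robot and the simulator.
MASS_RANGE = (0.8, 1.2)
FRICTION_RANGE = (0.5, 1.0)


def sample_params(rng: random.Random) -> SimParams:
    """Draw one parameter set uniformly from the ranges above."""
    return SimParams(
        mass=rng.uniform(*MASS_RANGE),
        friction=rng.uniform(*FRICTION_RANGE),
    )


rng = random.Random(0)  # fixed seed so the sketch is reproducible
episode_params = [sample_params(rng) for _ in range(5)]

for i, p in enumerate(episode_params):
    # A training loop would reset the simulator with p here and collect
    # one rollout of the (single, shared) policy before updating it.
    print(f"episode {i}: mass={p.mass:.3f} kg, friction={p.friction:.3f}")
```

Because the same policy is trained across all sampled parameter sets and never observes them, it has to behave acceptably for the whole range, which is why the approach resembles robust control.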

Learning to Adapt

  • Randomize the physical parameters (mass, friction, etc.) in simulation
  • Train an adaptive RL policy that works for many parameter values
    • Approximation of adaptive control
  • Issue: the true parameter values are often unknown in the real world
    • Solution: learning from a privileged teacher
      • Sim: first train a teacher policy with privileged information
      • Sim: the student policy learns from the teacher, without access to the privileged information
      • Real: deploy the student policy
    • Basically an imitation learning problem
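
The teacher-student recipe above can be sketched on a toy problem. Everything in this example is an assumption chosen to keep it small: a 1-D velocity system, a hidden constant force e standing in for the privileged parameter, a hand-written teacher instead of an RL-trained one, and a linear student fitted by least squares on a one-step history. It is meant to show the imitation structure, not any particular published method.

```python
import numpy as np

DT, K, TARGET = 0.1, 2.0, 1.0  # hypothetical time step, gain, target velocity


def step(v, u, e):
    """Toy 1-D dynamics: velocity driven by the action u plus a hidden force e."""
    return v + DT * (u + e)


def teacher(v, e):
    """Privileged policy: observes e directly and cancels it."""
    return K * (TARGET - v) - e


def features(v_prev, u_prev, v_cur):
    """Student input: a one-step history only, with no access to e."""
    return np.array([v_prev, u_prev, v_cur, 1.0])


# Sim: roll out with randomized e and record (history, teacher action) pairs.
rng = np.random.default_rng(0)
X, y = [], []
for _ in range(200):
    e = rng.uniform(-1.0, 1.0)
    v_prev, u_prev = rng.uniform(-1.0, 1.0), 0.0
    v = step(v_prev, u_prev, e)  # one zero-action step to create a history
    for _ in range(20):
        X.append(features(v_prev, u_prev, v))
        y.append(teacher(v, e))
        # Execute the teacher action plus noise so the data covers more states.
        u = y[-1] + rng.normal(0.0, 0.3)
        v_prev, u_prev, v = v, u, step(v, u, e)

# Sim: the student imitates the teacher (supervised regression).
w, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)


def student(v_prev, u_prev, v_cur):
    """Deployable policy: maps recent history to an action."""
    return float(features(v_prev, u_prev, v_cur) @ w)


# "Real": deploy the student; e_real is never shown to the policy.
e_real = 0.7
v_prev, u_prev = 0.0, 0.0
v = step(v_prev, u_prev, e_real)
for _ in range(50):
    u = student(v_prev, u_prev, v)
    v_prev, u_prev, v = v, u, step(v, u, e_real)

final_error = abs(v - TARGET)
print(f"final tracking error: {final_error:.2e}")
```

In this toy setting the hidden force can be recovered exactly from one step of history, so a linear student is enough. With parameters such as mass or friction the relationship is generally nonlinear, and the student would typically be a neural network trained on a longer history.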