Brayden Zhang


    HOME - Deep Reinforcement Learning

    Sep 01, 2025 · 1 min read

    • robotics
    • tutorial

    Notes on deep reinforcement learning (Deep RL), covering the main classes of algorithms, implementation details in code, and applications to robotics and games. These notes are heavily based on the Hugging Face Deep RL Course.

    Table of Contents

    1. The Ingredients of RL
    2. Q-Learning
    3. Policy Gradient
      1. Actor-Critic Methods
      2. Proximal Policy Optimization (PPO)
    4. Offline RL

    Applications

    Reinforcement Learning from Human Feedback

    References

    Hugging Face Deep RL Course

    CS 285 Berkeley

