Deep Reinforcement Learning Algorithms Library

Single-file implementations of deep RL algorithms in PyTorch. Each algorithm is self-contained: readable, runnable, and free of unnecessary abstraction.

NeatRL Library

The neatrl/ package provides reusable training utilities built on top of the individual implementations. Install via pip:

pip install "neatrl[classic,box2d,atari]"

Then call a training function from Python:

from neatrl import train_dqn

model = train_dqn(env_id="CartPole-v1", total_timesteps=10000, seed=42)

Full source: github.com/YuvrajSingh-mist/NeatRL/tree/master/neatrl

Implementations

Value-Based

Policy-Based

  • REINFORCE — Monte Carlo policy gradient
  • A2C — Advantage Actor-Critic
  • PPO — Proximal Policy Optimization
  • FlappyBird PPO — PPO on Flappy Bird
  • GRPO — Group Relative Policy Optimization (DeepSeek-R1)
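Two quantities behind the entries above are simple enough to sketch without a framework: the Monte Carlo return that REINFORCE uses to weight log-probabilities, and the group-relative advantage that gives GRPO its name. This is a minimal illustration in plain Python, not this repository's code. The function names, the default `gamma`, and the `eps` stabilizer are assumptions made for the example.

```python
import statistics


def discounted_returns(rewards, gamma=0.99):
    """Monte Carlo return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def group_relative_advantages(group_rewards, eps=1e-8):
    """Standardize rewards within one group of samples: (r - mean) / (std + eps)."""
    mean = statistics.fmean(group_rewards)
    std = statistics.pstdev(group_rewards)  # population std over the group
    # eps (hypothetical default) avoids division by zero when all rewards match.
    return [(r - mean) / (std + eps) for r in group_rewards]


if __name__ == "__main__":
    print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))
    print(group_relative_advantages([1.0, 3.0]))
```

In a policy-gradient update, either quantity would multiply the log-probability of the corresponding action. The group-relative form needs no learned value function, because the baseline is the mean reward of the group itself.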

Continuous Control

  • DDPG — Deep Deterministic Policy Gradient
  • TD3 — Twin Delayed DDPG
  • SAC — Soft Actor-Critic
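All three methods above are usually trained with slowly moving target networks, and TD3 additionally bootstraps from the smaller of two critic estimates. The sketch below shows both ideas on plain Python floats so that it runs without PyTorch. The function names, the dictionary-of-floats parameter layout, and the default `tau` and `gamma` values are assumptions for illustration, not taken from this repository.

```python
def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: target <- (1 - tau) * target + tau * online, in place."""
    for name, online_value in online_params.items():
        target_params[name] = (1.0 - tau) * target_params[name] + tau * online_value
    return target_params


def clipped_double_q_target(reward, done, q1_next, q2_next, gamma=0.99):
    """TD3-style target: bootstrap from the smaller of two target-critic estimates."""
    # No bootstrapping past a terminal transition.
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * min(q1_next, q2_next)


if __name__ == "__main__":
    target = {"w": 0.0}
    online = {"w": 1.0}
    print(soft_update(target, online, tau=0.5))
    print(clipped_double_q_target(1.0, False, 2.0, 3.0, gamma=0.5))
```

A small `tau` keeps the targets stable at the cost of slower tracking. Taking the minimum of the two critics is meant to counter the overestimation bias that a single critic tends to accumulate.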

Exploration & Multi-Agent

  • RND — Random Network Distillation + PPO
  • Imitation Learning — Behavioral cloning
  • MARL — Multi-Agent RL (IPPO, MAPPO, Self-Play)

References