Deep Reinforcement Learning Algorithms Library
One-file implementations of deep RL algorithms in PyTorch. Each algorithm is self-contained — readable, runnable, and stripped of unnecessary abstraction.
NeatRL Library
The neatrl/ package provides reusable training utilities built on top of the individual implementations. Install via pip:
pip install "neatrl[classic,box2d,atari]"
from neatrl import train_dqn
model = train_dqn(env_id="CartPole-v1", total_timesteps=10000, seed=42)
Full source: github.com/YuvrajSingh-mist/NeatRL/tree/master/neatrl
Implementations
Value-Based
- DQN — Deep Q-Network for CartPole and LunarLander
- DQN Atari — DQN with conv nets on Breakout
- DQN Flappy — DQN on Flappy Bird
- DQN Lunar — DQN tuned for Lunar Lander
- DQN Taxi — DQN for discrete Taxi-v3
- DQN FrozenLake — DQN on FrozenLake
- Dueling DQN — Separate value and advantage streams
- Q-Learning — Tabular Q-Learning and Value Iteration
- VizDoom RL — DQN in a 3D first-person environment
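The DQN entries above share one bootstrapped target, and the dueling variant only changes how Q-values are assembled from its two streams. The following is a minimal sketch in plain Python, not code from this repository: the function names `td_target` and `dueling_q_values` are hypothetical, and the mean-subtracted aggregation is the commonly used form, assumed here rather than confirmed from the source.

```python
def td_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step target: r + gamma * max_a' Q(s', a'), with no bootstrap at episode end."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def dueling_q_values(state_value, advantages):
    """Combine a scalar V(s) with per-action advantages A(s, a).

    Subtracting the mean advantage keeps V and A identifiable
    (assumed aggregation; some implementations subtract the max instead).
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + (a - mean_adv) for a in advantages]


# Hypothetical numbers for illustration only.
q_next = dueling_q_values(2.0, [1.0, 0.0, -1.0])
target = td_target(reward=1.0, next_q_values=q_next, gamma=0.99, done=False)
terminal_target = td_target(reward=1.0, next_q_values=q_next, done=True)
print(q_next, target, terminal_target)
```

In a PyTorch implementation the same arithmetic would typically run on batched tensors, with the target computed from a separate, periodically synced target network.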
Policy-Based
- REINFORCE — Monte Carlo policy gradient
- A2C — Advantage Actor-Critic
- PPO — Proximal Policy Optimization
- FlappyBird PPO — PPO on Flappy Bird
- GRPO — Group Relative Policy Optimization (DeepSeek-R1)
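PPO and GRPO in the list above both rely on a clipped probability-ratio objective; GRPO replaces the learned critic with advantages normalized within a group of sampled responses. The sketch below illustrates those two pieces in plain Python. It is not this repository's code: the function names are hypothetical, the use of the population standard deviation is an assumption (implementations differ), and the KL penalty that GRPO usually adds is omitted.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own group."""
    mean_r = statistics.fmean(rewards)
    std_r = statistics.pstdev(rewards)  # assumption: population std
    return [(r - mean_r) / (std_r + eps) for r in rewards]


def clipped_surrogate(ratio, advantage, clip_eps=0.2):
    """PPO-style term: min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped_ratio * advantage)


# Hypothetical group of four sampled responses with binary rewards.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# Hypothetical new/old policy probability ratios for the same responses.
ratios = [1.5, 0.5, 1.0, 0.9]
objective = sum(clipped_surrogate(r, a) for r, a in zip(ratios, advantages)) / len(ratios)
print(advantages, objective)
```

The clip removes the incentive to push the ratio beyond `1 + clip_eps` when the advantage is positive, or below `1 - clip_eps` when it is negative, which is what keeps each update close to the policy that collected the data.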
Continuous Control
Exploration & Multi-Agent
- RND — Random Network Distillation + PPO
- Imitation Learning — Behavioral cloning
- MARL — Multi-Agent RL (IPPO, MAPPO, Self-Play)
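For the RND entry above, the core idea is that a predictor network is trained to match a frozen, randomly initialized target network, and the prediction error on a state serves as an exploration bonus. The sketch below is a deliberately simplified stand-in, not this repository's implementation: it uses small linear maps in plain Python instead of PyTorch networks, skips the reward and observation normalization that practical RND uses, and all function names are hypothetical.

```python
import random


def make_linear(in_dim, out_dim, seed):
    """Random weight matrix (out_dim x in_dim) standing in for a network."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def forward(weights, x):
    """Apply the linear map to a state vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def intrinsic_reward(target, predictor, x):
    """Mean squared error between predictor and frozen target outputs."""
    t = forward(target, x)
    p = forward(predictor, x)
    return sum((pi - ti) ** 2 for pi, ti in zip(p, t)) / len(t)


def update_predictor(target, predictor, x, lr=0.05):
    """One gradient-descent step on the MSE; only the predictor changes."""
    t = forward(target, x)
    p = forward(predictor, x)
    n = len(t)
    for j, row in enumerate(predictor):
        scale = 2.0 * (p[j] - t[j]) / n
        for i, xi in enumerate(x):
            row[i] -= lr * scale * xi


target = make_linear(3, 4, seed=0)      # frozen, never updated
predictor = make_linear(3, 4, seed=1)   # trained on visited states
state = [0.5, -0.2, 0.8]                # hypothetical observation

novel_before = intrinsic_reward(target, predictor, state)
for _ in range(200):
    update_predictor(target, predictor, state)
novel_after = intrinsic_reward(target, predictor, state)
print(novel_before, novel_after)
```

The bonus shrinks as the same state is visited repeatedly, so states the agent has rarely seen keep a comparatively large intrinsic reward. In an RND + PPO setup this bonus is typically added to the environment reward before advantages are computed.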
References
- Sutton & Barto — Reinforcement Learning: An Introduction
- CleanRL — primary inspiration for the one-file style