PyTorch implementations of algorithms from "Reinforcement Learning: An Introduction" by Sutton and Barto, along with various RL research papers.
☆208Aug 14, 2025Updated 9 months ago
Alternatives and similar repositories for Reinforcement-Learning
Users that are interested in Reinforcement-Learning are comparing it to the libraries listed below.
- ☆13Aug 13, 2025Updated 9 months ago
- NanoGPT (124M) in 5 minutes☆15Feb 14, 2025Updated last year
- So, I trained a 130M Llama architecture that I coded from the ground up to build a small instruct model from scratch. Trained on FineWeb dataset…☆17Mar 26, 2025Updated last year
- Reinforcement learning library with support for pcsx2 and opengl, among other cores.☆61Updated this week
- The core repository of the elsciRL framework.☆18Dec 8, 2025Updated 5 months ago
- Subtask-Aware Visual Reward Learning from Segmented Demonstrations (ICLR 2025 accepted)☆19Apr 11, 2025Updated last year
- robot foundation models☆30Mar 23, 2025Updated last year
- Code for "Baba Is AI: Break the Rules to Beat the Benchmark"☆46Sep 3, 2025Updated 8 months ago
- Implementation of all RL algorithms in a simpler way☆1,564Aug 29, 2025Updated 8 months ago
- High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, T…☆9,785Apr 20, 2026Updated last month
- The official code release for Q#: Provably Optimal Distributional RL for LLM Post-Training☆20Mar 4, 2025Updated last year
- An open source deep learning library for Unity.☆17May 16, 2026Updated last week
- IIT Guwahati's Gold Medal winning solution to DevRev’s Expert Answers in a Flash Improving Domain-Specific QA☆11Jul 26, 2025Updated 10 months ago
- ☆36Jul 8, 2025Updated 10 months ago
- Collection of resources for the frobs_rl package.☆61Oct 8, 2024Updated last year
- Official repository of the spotlight ICML 2025 paper, PokeChamp: an Expert-level Minimax Language Agent.☆154Mar 11, 2026Updated 2 months ago
- ☆15Apr 29, 2024Updated 2 years ago
- A list of companies focusing on geospatial intelligence, GIS, RS, Climate risks, and more☆21Jul 29, 2025Updated 9 months ago
- Linux distribution for space-grade robotics on the BeagleV-Fire RISC-V platform + FPGA support☆21Dec 24, 2025Updated 5 months ago
- Automatic Thief Detection via CCTV with Alarm System and Perpetrator Image Capture using YOLOv5 + ROI. This project utilizes computer vis…☆19Oct 21, 2024Updated last year
- A collection of sophisticated computer vision and machine learning problems for graduate-level researchers and practitioners☆40Jun 13, 2025Updated 11 months ago
- This project provides a set of translators to convert OpenAI Gym environments into text-based environments. It is designed to investigate…☆21May 29, 2024Updated last year
- all code examples in the blog posts☆21Jan 27, 2025Updated last year
- Controlling servos via UDP and motor with ESP32 and Godot☆11Jun 20, 2021Updated 4 years ago
- A Gymnasium Environment for the Job Shop Problem Using the Disjunctive Graph Approach.☆28May 4, 2026Updated 3 weeks ago
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning.☆17Apr 22, 2025Updated last year
- A locally trained model of Stoney Nakoda has been developed and released. You can access the working model here or train your own instanc…☆10May 11, 2026Updated 2 weeks ago
- Implementation of the paper on Embodiment Scaling Laws in Robot Locomotion (CoRL 2025)☆26Sep 23, 2025Updated 8 months ago
- 🚀 Train a VLA "large model" yourself, end to end. 🌏 Train a 26M-parameter vision multimodal VLM from scratch in just 1 hour!☆33Oct 16, 2025Updated 7 months ago
- [ICML 2025] Repository for M3-JEPA: Multimodal Alignment via Multi-gate MoE based on the Joint-Predictive Embedding Architecture☆29Mar 13, 2026Updated 2 months ago
- Jax-Baseline is a Reinforcement Learning implementation using JAX and Flax/Haiku libraries, mirroring the functionality of Stable-Baselin…☆65Jan 2, 2026Updated 4 months ago
- Reinforcement learning framework.☆17Jul 25, 2025Updated 10 months ago
- Computation of binomial confidence intervals that achieve exact coverage.☆16Apr 23, 2025Updated last year
- zero-code hyperparameters optimization framework☆14Jan 25, 2024Updated 2 years ago
- Code for the paper "Function-Space Learning Rates"☆24Jun 3, 2025Updated 11 months ago
- Fine tune Gemma 3 on an object detection task☆106Jul 14, 2025Updated 10 months ago
- Course content from ACM AI's Winter 2022 iteration of beginner track workshops.☆16Mar 18, 2022Updated 4 years ago
- Final Year Project, Imperial College London☆13Apr 1, 2019Updated 7 years ago
- Minimum Description Length probing for neural network representations☆20Jan 28, 2025Updated last year