mingyin0312 / RLFromScratch
☆463 Updated 3 months ago
Alternatives and similar repositories for RLFromScratch
Users who are interested in RLFromScratch are comparing it to the libraries listed below.
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆561 Updated 2 months ago
- rl from zero pretrain, can it be done? yes. ☆281 Updated 2 months ago
- FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference. ☆313 Updated last month
- Physics of Language Models, Part 4 ☆262 Updated 4 months ago
- Exploring Applications of GRPO ☆249 Updated 3 months ago
- ☆224 Updated last week
- Tina: Tiny Reasoning Models via LoRA ☆309 Updated 2 months ago
- An extension of the nanoGPT repository for training small MOE models. ☆215 Updated 8 months ago
- ☆917 Updated last month
- dLLM: Simple Diffusion Language Modeling ☆1,069 Updated this week
- minimal GRPO implementation from scratch ☆100 Updated 8 months ago
- GPU-optimized framework for training diffusion language models at any scale. The backend of Quokka, Super Data Learners, and OpenMoE 2 tr… ☆289 Updated 3 weeks ago
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ☆322 Updated this week
- ☆344 Updated this week
- nanoGRPO is a lightweight implementation of Group Relative Policy Optimization (GRPO) ☆126 Updated 6 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆360 Updated 11 months ago
- ☆128 Updated 2 weeks ago
- Minimal hackable GRPO implementation ☆303 Updated 10 months ago
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models ☆223 Updated last month
- Normalized Transformer (nGPT) ☆194 Updated last year
- Open-source framework for the research and development of foundation models. ☆648 Updated this week
- A Gym for Agentic LLMs ☆371 Updated 3 weeks ago
- ☆82 Updated 4 months ago
- Flash-Muon: An Efficient Implementation of Muon Optimizer ☆212 Updated 5 months ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆304 Updated last month
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning. ☆118 Updated 2 weeks ago
- 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc. ☆576 Updated last month
- ☆403 Updated 11 months ago
- [ICLR 2025] Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule ☆382 Updated 2 months ago
- Nano repo for RL training of LLMs ☆70 Updated last month