nph4rd / grpo
simple grpo
☆12Updated 3 months ago
Alternatives and similar repositories for grpo
Users that are interested in grpo are comparing it to the libraries listed below
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training☆132Updated last year
- A puzzle to learn about prompting☆135Updated 2 years ago
- Highly commented implementations of Transformers in PyTorch☆136Updated 2 years ago
- ☆101Updated last week
- ☆89Updated last year
- ☆31Updated 10 months ago
- Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy**.☆52Updated 4 months ago
- git extension for {collaborative, communal, continual} model development☆216Updated 10 months ago
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*☆87Updated last year
- Resources from the EleutherAI Math Reading Group☆54Updated 7 months ago
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.☆159Updated last year
- ML/DL Math and Method notes☆63Updated last year
- PageRank for LLMs☆50Updated 2 weeks ago
- Easily run PyTorch on multiple GPUs & machines☆47Updated 3 months ago
- Benchmarking Agentic LLM and VLM Reasoning On Games☆193Updated last month
- ☆94Updated last year
- ☆55Updated last year
- Seamless interface for using PyTorch distributed with Jupyter notebooks☆50Updated last week
- A library to create and manage configuration files, especially for machine learning projects.☆79Updated 3 years ago
- Repository for the paper Stream of Search: Learning to Search in Language☆151Updated 7 months ago
- Compiling useful links, papers, benchmarks, ideas, etc.☆45Updated 6 months ago
- ☆65Updated 10 months ago
- Train vision models using JAX and 🤗 transformers☆100Updated last week
- M4 experiment logbook☆58Updated 2 years ago
- ☆97Updated last month
- ☆49Updated 7 months ago
- A reading list of relevant papers and projects on foundation model annotation☆27Updated 7 months ago
- 🧠 Starter templates for doing interpretability research☆74Updated 2 years ago
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines☆197Updated last year
- ☆142Updated 2 weeks ago