nph4rd / grpo
simple grpo
☆12 Updated 6 months ago
Alternatives and similar repositories for grpo
Users that are interested in grpo are comparing it to the libraries listed below
- A puzzle to learn about prompting ☆135 Updated 2 years ago
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆132 Updated last year
- ☆91 Updated last year
- ☆115 Updated 2 weeks ago
- Highly commented implementations of Transformers in PyTorch ☆139 Updated 2 years ago
- ☆10 Updated last year
- Open Character Training ☆59 Updated 3 weeks ago
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets. ☆160 Updated last year
- Compiling useful links, papers, benchmarks, ideas, etc. ☆45 Updated 9 months ago
- ☆31 Updated last year
- ☆72 Updated last year
- ☆104 Updated 4 months ago
- ☆47 Updated 6 months ago
- A reading list of relevant papers and projects on foundation model annotation ☆28 Updated 9 months ago
- Losslessly encode text natively with arithmetic coding and HuggingFace Transformers ☆76 Updated last month
- Benchmarking Agentic LLM and VLM Reasoning On Games ☆217 Updated 2 weeks ago
- ☆128 Updated last week
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning ☆328 Updated last month
- Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy**. ☆63 Updated 7 months ago
- Repository for the paper Stream of Search: Learning to Search in Language ☆152 Updated 10 months ago
- ☆94 Updated 2 years ago
- Implementation of Direct Preference Optimization ☆17 Updated 2 years ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆180 Updated 5 months ago
- σ-GPT: A New Approach to Autoregressive Models ☆70 Updated last year
- MoE training for Me and You and maybe other people ☆239 Updated this week
- ☆56 Updated last year
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆121 Updated 2 months ago
- An introduction to LLM Sampling ☆79 Updated last year
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆66 Updated last month
- M4 experiment logbook ☆58 Updated 2 years ago