nph4rd / grpo
simple grpo
☆12 · Updated 8 months ago
Alternatives and similar repositories for grpo
Users interested in grpo are comparing it to the repositories listed below.
- ☆10 · Updated last year
- Minimal (400 LOC) implementation, maximum (multi-node, FSDP) GPT training ☆132 · Updated last year
- Highly commented implementations of Transformers in PyTorch ☆138 · Updated 2 years ago
- ☆92 · Updated last year
- Shaping capabilities with token-level pretraining data filtering ☆75 · Updated 2 weeks ago
- ☆108 · Updated this week
- A puzzle to learn about prompting ☆135 · Updated 2 years ago
- ☆31 · Updated last year
- Train vision models using JAX and 🤗 transformers ☆100 · Updated last month
- ☆118 · Updated last week
- ☆37 · Updated last year
- 🧠 Starter templates for doing interpretability research ☆76 · Updated 2 years ago
- A reading list of relevant papers and projects on foundation model annotation ☆28 · Updated 11 months ago
- ☆47 · Updated 8 months ago
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets ☆160 · Updated last year
- Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆86 · Updated 2 years ago
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆135 · Updated 3 years ago
- ☆284 · Updated last year
- Repository for the paper "Stream of Search: Learning to Search in Language" ☆153 · Updated last year
- Extract full next-token probabilities via language model APIs ☆248 · Updated last year
- ☆153 · Updated 5 months ago
- WIP ☆93 · Updated last year
- Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy** ☆65 · Updated 9 months ago
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆66 · Updated 2 months ago
- ☆77 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs ☆186 · Updated 3 weeks ago
- Tools to make language models a bit easier to use ☆64 · Updated last week
- Applying SAEs for fine-grained control ☆25 · Updated last year
- A collection of lightweight interpretability scripts to understand how LLMs think ☆89 · Updated 2 weeks ago
- Comprehensive analysis of differences in performance between QLoRA, LoRA, and full fine-tunes ☆83 · Updated 2 years ago