Red-Hat-AI-Innovation-Team / mini_trainer

fast trainer for educational purposes

☆18

Alternatives and similar repositories for mini_trainer

Users who are interested in mini_trainer are comparing it to the repositories listed below.

protagolabs / odyssey-math
☆83 Updated 8 months ago
anadim / the-little-retrieval-test
☆34 Updated 2 years ago
McGill-NLP / VinePPO
Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"
☆175 Updated 4 months ago
princeton-nlp / HELMET
The HELMET Benchmark
☆177 Updated 2 months ago
HazyResearch / zoology
Understand and test language model architectures on synthetic tasks.
☆233 Updated 3 weeks ago
lmarena / PPE
☆53 Updated 5 months ago
mlfoundations / scaling
Language models scale reliably with over-training and on downstream tasks
☆100 Updated last year
princeton-nlp / ProLong
Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"
☆230 Updated last month
ServiceNow / PipelineRL
A scalable asynchronous reinforcement learning implementation with in-flight weight updates.
☆207 Updated last week
princeton-nlp / USACO
Can Language Models Solve Olympiad Programming?
☆118 Updated 9 months ago
google-deepmind / loft
LOFT: A 1 Million+ Token Long-Context Benchmark
☆218 Updated 4 months ago
gregorbachmann / Next-Token-Failures
☆101 Updated last year
Edward-Sun / easy-to-hard
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
☆123 Updated last year
R2E-Gym / R2E-Gym
[COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents
☆170 Updated 3 months ago
berlino / seq_icl
☆53 Updated last year
HKUNLP / diffusion-vs-ar
[ICLR 2025] Code for the paper "Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning"
☆77 Updated 8 months ago
hamishivi / EasyLM
Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl…
☆75 Updated last year
hughbzhang / o1_inference_scaling_laws
Replicating O1 inference-time scaling laws
☆90 Updated 10 months ago
GFNOrg / gfn-lm-tuning
☆186 Updated last year
mnoukhov / async_rlhf
Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models
☆63 Updated 5 months ago
booydar / babilong
BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach.
☆213 Updated last month
METR / RE-Bench
☆112 Updated this week
epfml / schedules-and-scaling
Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"
☆84 Updated 11 months ago
ars22 / scaling-LLM-math-synthetic-data
Code and data used in the paper: "Training on Incorrect Synthetic Data via RL Scales LLM Math Reasoning Eight-Fold"
☆30 Updated last year
lee-ny / teaching_arithmetic
☆83 Updated 2 years ago
mandyyyyii / scibench
☆128 Updated last year
wellecks / lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
☆24 Updated last year
athms / mad-lab
A MAD laboratory to improve AI architecture designs 🧪
☆131 Updated 10 months ago
UFO-101 / auto-circuit
A library for efficient patching and automatic circuit discovery.
☆77 Updated 2 months ago
princeton-nlp / LM-Kernel-FT
A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643
☆78 Updated 2 years ago