tyler-romero/microR1

Simple repository for training small reasoning models

☆51

Alternatives and similar repositories for microR1

Users who are interested in microR1 are comparing it to the repositories listed below.

brendanhogan / 2025_advent_of_small_ml
☆22 · Dec 24, 2025 · Updated 7 months ago
tyler-romero / nanogpt-speedrun
NanoGPT (124M) as fast as possible
☆20 · Apr 15, 2025 · Updated last year
andrew-silva / mlx-rlhf
An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace.
☆37 · Jun 21, 2024 · Updated 2 years ago
willccbb / localchat
☆13 · Apr 16, 2025 · Updated last year
haizelabs / nyc-ai-reading
nyc is so back
☆21 · Jun 27, 2025 · Updated last year
aeturrell / smartrappy
Smart reproducible analytical pipeline inspection
☆21 · Feb 13, 2026 · Updated 5 months ago
deepvk / muse
🎵 muse: Music Separation
☆11 · Feb 14, 2024 · Updated 2 years ago
brendanhogan / picoDeepResearch
☆69 · May 23, 2025 · Updated last year
ml-jku / plstm_experiments
☆16 · Oct 21, 2025 · Updated 9 months ago
uygarkurt / BERT-PyTorch
☆17 · Jan 3, 2025 · Updated last year
N8python / mlx-pretrain
A simple MLX implementation for pretraining LLMs on Apple Silicon.
☆85 · Aug 20, 2025 · Updated 11 months ago
McGill-NLP / nano-aha-moment
Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"
☆625 · Oct 7, 2025 · Updated 9 months ago
davisyoshida / jax-gptq
JAX implementation of GPTQ quantization algorithm
☆10 · Jul 19, 2023 · Updated 3 years ago
tianshao1992 / DENO4pytorch
Differential equation neural operator
☆22 · Sep 4, 2023 · Updated 2 years ago
taylorai / lm-deluge
utilities for batched llm calls with retries
☆51 · Updated this week
textcortex / claude-code-pr-autodoc-action
A GitHub Actions Workflow that lets Claude Code auto-generate documentation for your PRs after they are merged
☆15 · Jun 10, 2025 · Updated last year
Shekswess / tiny-reasoning-language-model
Code repository dedicated to experimentation and research with tiny reasoning language models
☆52 · Nov 24, 2025 · Updated 8 months ago
PrimeIntellect-ai / lab-cookbook
Lab Cookbook
☆38 · Updated this week
collinear-ai / spider
Streamline on-policy/off-policy distillation workflows in a few lines of code
☆109 · Updated this week
groundlight / r1_vlm
Build your own visual reasoning model
☆421 · Jan 13, 2026 · Updated 6 months ago
arc-community / arc-generative-DSL-infinite-data
slowly building a set of infinite riddle generators for data-hungry methods
☆14 · Nov 15, 2022 · Updated 3 years ago
gabrielpetersson / simple-variational-auto-encoder
a simple variational auto encoder with some exploration
☆13 · Nov 22, 2024 · Updated last year
strangeloopcanon / tevo
TEVO: evolve LM motifs cheaply, then validate them in downstream train.py loops.
☆19 · Apr 18, 2026 · Updated 3 months ago
andrew-silva / clean-rl-mlx
Clean RL implementation using MLX
☆34 · Mar 8, 2024 · Updated 2 years ago
frankxwang / dpo-prefix-sharing
DPO, but faster 🚀
☆52 · Dec 6, 2024 · Updated last year
Ziems / arbor
A framework for optimizing DSPy programs with RL
☆340 · Jan 12, 2026 · Updated 6 months ago
Weixin-Liang / Mixture-of-Mamba
☆51 · Jan 28, 2025 · Updated last year
tatsu432 / BDCM
☆17 · Mar 24, 2026 · Updated 4 months ago
wassname / world-models-sonic-pytorch
Attempt at reinforcement learning with curiosity for Sonic the Hedgehog games. Number 149 on OpenAI retro contest leaderboard, but more w…
☆33 · Sep 17, 2018 · Updated 7 years ago
darrow-labs / LegalLens
☆10 · Jul 15, 2024 · Updated 2 years ago
sfeucht / footprints
https://footprints.baulab.info
☆17 · Oct 4, 2024 · Updated last year
dbigham / ARC
Abstraction and Reasoning Corpus
☆15 · Nov 22, 2022 · Updated 3 years ago
goombalab / Gather-and-Aggregate
Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"
☆16 · Apr 30, 2025 · Updated last year
malteos / legal-document-similarity
Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal …
☆32 · Apr 29, 2021 · Updated 5 years ago
ChrisHayduk / QLoRA-for-MLM
QLoRA for Masked Language Modeling
☆23 · Sep 11, 2023 · Updated 2 years ago
zhangxjohn / LLM-Agent-Benchmark-List
A benchmark list for evaluation of large language models.
☆167 · Jul 10, 2026 · Updated 2 weeks ago
muellerzr / smol-moe
☆25 · Oct 10, 2025 · Updated 9 months ago
haizelabs / annotate
Skill to annotate and create AI judges from agent logs
☆17 · Oct 28, 2025 · Updated 9 months ago
teilomillet / retrain
A Python library that uses Reinforcement Learning (RL) to train LLMs.
☆43 · Jul 12, 2026 · Updated 2 weeks ago