tyler-romero / microR1
Simple repository for training small reasoning models
☆49 · Feb 6, 2025 · Updated last year
Alternatives and similar repositories for microR1
Users that are interested in microR1 are comparing it to the libraries listed below
- DPO, but faster ☆47 · Dec 6, 2024 · Updated last year
- JAX implementation of GPTQ quantization algorithm ☆10 · Jul 19, 2023 · Updated 2 years ago
- ☆17 · Jan 3, 2025 · Updated last year
- ☆14 · Apr 16, 2025 · Updated 10 months ago
- Smart reproducible analytical pipeline inspection ☆21 · Updated this week
- https://footprints.baulab.info ☆17 · Oct 4, 2024 · Updated last year
- ☆17 · Oct 9, 2023 · Updated 2 years ago
- nanogpt turned into a chat model ☆81 · Aug 30, 2023 · Updated 2 years ago
- Build your own visual reasoning model ☆419 · Jan 13, 2026 · Updated last month
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining" ☆26 · Oct 14, 2025 · Updated 4 months ago
- A Ray-based data loader with per-epoch shuffling and configurable pipelining, for shuffling and loading training data for distributed tra… ☆18 · Jan 5, 2023 · Updated 3 years ago
- ☆19 · May 6, 2023 · Updated 2 years ago
- A distributed GPU-centric experience replay system for large AI models. ☆18 · Aug 1, 2023 · Updated 2 years ago
- cheap & easy LLM experiments for amateurs (alpha) ☆25 · Nov 30, 2025 · Updated 2 months ago
- ☆20 · Jun 11, 2023 · Updated 2 years ago
- QLoRA for Masked Language Modeling ☆22 · Sep 11, 2023 · Updated 2 years ago
- Build complex types from simple blueprints with Pydantic ☆26 · Feb 8, 2026 · Updated last week
- Code for Named Entity Recognition using Deep Bidirectional LSTM (Char and Word Level Embedding) + HighWay Layer + CRF ☆23 · Dec 6, 2017 · Updated 8 years ago
- A benchmark list for evaluation of large language models. ☆159 · Jan 20, 2026 · Updated 3 weeks ago
- A collection of Kanren implementations in Julia ☆24 · Oct 14, 2025 · Updated 4 months ago
- Code for "Counterfactual Token Generation in Large Language Models", Arxiv 2024. ☆32 · Nov 7, 2024 · Updated last year
- Official pytorch implementation of ZiRa, a method for incremental vision language object detection (IVLOD), which has been accepted by Neu… ☆28 · Oct 22, 2024 · Updated last year
- Slack bot that publishes a team's pull requests to their Slack channel ☆26 · Updated this week
- This is the official repository for the "Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP" paper acce… ☆25 · Apr 18, 2024 · Updated last year
- Notebooks for Fastai Visual Guide ☆25 · Apr 12, 2023 · Updated 2 years ago
- ☆27 · Mar 14, 2024 · Updated last year
- Library for training process reward models ☆29 · Jun 3, 2025 · Updated 8 months ago
- This repository contains a simple llama3 implementation in pure jax. ☆71 · Feb 17, 2025 · Updated 11 months ago
- Due to the huge vocabulary size (151,936) of Qwen models, the Embedding and LM Head weights are excessively heavy. Therefore, this projec… ☆32 · Jan 6, 2026 · Updated last month
- Attempt at reinforcement learning with curiosity for Sonic the Hedgehog games. Number 149 on OpenAI retro contest leaderboard, but more w… ☆32 · Sep 17, 2018 · Updated 7 years ago
- rl from zero pretrain, can it be done? yes. ☆286 · Sep 28, 2025 · Updated 4 months ago
- ☆12 · Oct 7, 2020 · Updated 5 years ago
- Forked from https://gitlab.com/MatejB/PrePoMax ☆12 · Jan 8, 2024 · Updated 2 years ago
- Dive into Jax, Flax, XLA and C++ ☆32 · Apr 1, 2020 · Updated 5 years ago
- awesome synthetic (text) datasets ☆323 · Jan 8, 2026 · Updated last month
- Async RL Training at Scale ☆1,071 · Updated this week
- Official code for TLDR: Unsupervised Goal-Conditioned RL via Temporal Distance-Aware Representations ☆36 · Jan 24, 2026 · Updated 3 weeks ago
- MobileLLM-R1 ☆75 · Sep 30, 2025 · Updated 4 months ago
- ☆42 · Apr 22, 2025 · Updated 9 months ago