nano-R1 / resources

Compiling useful links, papers, benchmarks, ideas, etc.

☆45

Alternatives and similar repositories for resources

Users who are interested in resources are comparing it to the repositories listed below.

LeonGuertler / UnstableBaselines
☆106 Updated last month
tokenbender / avataRL
rl from zero pretrain, can it be done? yes.
☆281 Updated 2 months ago
JoeLi12345 / nGPT
an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆108 Updated 8 months ago
tyler-romero / microR1
Simple repository for training small reasoning models
☆46 Updated 9 months ago
VatsaDev / NanoPoor
NanoGPT-speedrunning for the poor T4 enjoyers
☆73 Updated 7 months ago
brendanhogan / DeepSeekRL-Extended
Exploring Applications of GRPO
☆249 Updated 3 months ago
xjdr-alt / simple_transformer
Simple Transformer in Jax
☆139 Updated last year
goodfire-ai / r1-interpretability
Open source interpretability artefacts for R1.
☆163 Updated 7 months ago
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆107 Updated 8 months ago
brendanhogan / picoDeepResearch
☆68 Updated 6 months ago
casper-hansen / OpenCoconut
OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.
☆173 Updated 10 months ago
PrimeIntellect-ai / prime-environments
Training-Ready RL Environments + Evals
☆182 Updated this week
Pleias / Quest-Best-Tokens
An introduction to LLM Sampling
☆79 Updated 11 months ago
Danau5tin / calculator_agent_rl
Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a 62% absolute increase in evaluation accuracy.
☆60 Updated 6 months ago
HazyResearch / cartridges
Storing long contexts in tiny caches with self-study
☆218 Updated last month
PrimeIntellect-ai / genesys
☆136 Updated 8 months ago
SinatrasC / entropix
Entropy Based Sampling and Parallel CoT Decoding
☆17 Updated last year
facebookresearch / llm-speedrunner
The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag…
☆112 Updated last month
SinatrasC / entropix-smollm
smolLM with Entropix sampler on pytorch
☆149 Updated last year
jerber / lang-jepa
☆128 Updated 11 months ago
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆59 Updated last month
haizelabs / j1-micro
j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models.
☆99 Updated 4 months ago
PrimeIntellect-ai / pi-quant
SIMD quantization kernels
☆92 Updated 2 months ago
google-deepmind / mishax
☆143 Updated 2 months ago
divyamakkar0 / JAXformer
A zero-to-one guide on scaling modern transformers with n-dimensional parallelism.
☆105 Updated 2 months ago
minosvasilias / simple_grpo
Simple GRPO scripts and configurations.
☆59 Updated 9 months ago
doomslide / hyperobject
Plotting (entropy, varentropy) for small LMs
☆99 Updated 6 months ago
Amplify-Partners / annotation-reading-list
A reading list of relevant papers and projects on foundation model annotation
☆28 Updated 9 months ago
okarthikb / state-space-models
☆28 Updated last year
naklecha / llm-inference-optimizations-explained
in this repository, i'm going to implement increasingly complex llm inference optimizations
☆70 Updated 6 months ago