SinatrasC / entropix-smollm

smolLM with Entropix sampler on pytorch

☆150

Alternatives and similar repositories for entropix-smollm

Users who are interested in entropix-smollm are comparing it to the libraries listed below.

doomslide / hyperobject
Plotting (entropy, varentropy) for small LMs
☆98 Updated 2 months ago
Mihaiii / backtrack_sampler
An easy-to-understand framework for LLM samplers that rewind and revise generated tokens
☆140 Updated 5 months ago
xjdr-alt / llmri
look how they massacred my boy
☆63 Updated 9 months ago
JoeLi12345 / nGPT
an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆103 Updated 4 months ago
QuixiAI / grokadamw
☆134 Updated 11 months ago
xjdr-alt / entropix-local
smol models are fun too
☆93 Updated 8 months ago
casper-hansen / OpenCoconut
OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.
☆173 Updated 6 months ago
Pleias / Quest-Best-Tokens
An introduction to LLM Sampling
☆79 Updated 7 months ago
jerber / lang-jepa
☆118 Updated 7 months ago
smolorg / smoltropix
MLX port for xjdr's entropix sampler (mimics jax implementation)
☆62 Updated 9 months ago
VatsaDev / NanoPoor
NanoGPT-speedrunning for the poor T4 enjoyers
☆68 Updated 3 months ago
xjdr-alt / simple_transformer
Simple Transformer in Jax
☆138 Updated last year
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆103 Updated 4 months ago
N8python / mlx-pretrain
A simple MLX implementation for pretraining LLMs on Apple Silicon.
☆83 Updated 3 months ago
samefarrar / entropix_mlx
Modify Entropy Based Sampling to work with Mac Silicon via MLX
☆49 Updated 8 months ago
Alex-Gurung / ReasoningNCP
Official repo for Learning to Reason for Long-Form Story Generation
☆68 Updated 3 months ago
kubernetes-bad / reward-composer
Lego for GRPO
☆28 Updated 2 months ago
minosvasilias / simple_grpo
Simple GRPO scripts and configurations.
☆59 Updated 5 months ago
xjdr-alt / muzero_sketch
☆38 Updated last year
bloc97 / DeMo
DeMo: Decoupled Momentum Optimization
☆190 Updated 8 months ago
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆55 Updated 6 months ago
SinatrasC / entropix
Entropy Based Sampling and Parallel CoT Decoding
☆17 Updated 9 months ago
QuixiAI / spectrum
☆128 Updated 3 months ago
JD-P / minihf
MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user…
☆177 Updated 2 weeks ago
joshuacnf / Ctrl-G
☆87 Updated 6 months ago
sdan / selfextend
an implementation of Self-Extend, to expand the context window via grouped attention
☆119 Updated last year
Danau5tin / calculator_agent_rl
Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy**.
☆45 Updated 2 months ago
brendanhogan / picoDeepResearch
☆64 Updated 2 months ago
magicproduct / hash-hop
Long context evaluation for large language models
☆220 Updated 5 months ago
tokenbender / avataRL
rl from zero pretrain, can it be done? we'll see.
☆66 Updated 2 weeks ago