lucidrains / HRMLinks

Exploration into the proposed architecture from Sapient Intelligence of Singapore 🇸🇬

☆72

Alternatives and similar repositories for HRM

Users that are interested in HRM are comparing it to the libraries listed below

Sorting:

LucasPrietoAl / grokking-at-the-edge-of-numerical-stability
☆105Updated 4 months ago
Zyphra / Zamba2
PyTorch implementation of models from the Zamba2 series.
☆186Updated 10 months ago
jadechip / nanoXLSTM
The simplest, fastest repository for training/finetuning medium-sized xLSTMs.
☆41Updated last year
Mihaiii / backtrack_sampler
An easy-to-understand framework for LLM samplers that rewind and revise generated tokens
☆146Updated 9 months ago
xjdr-alt / llmri
look how they massacred my boy
☆63Updated last year
casper-hansen / OpenCoconut
OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.
☆173Updated 10 months ago
OpenEvaByte / evabyte
EvaByte: Efficient Byte-level Language Models at Scale
☆111Updated 7 months ago
joshuacnf / Ctrl-G
☆105Updated 11 months ago
BlinkDL / modded-nanogpt-rwkv
RWKV-7: Surpassing GPT
☆100Updated last year
SinatrasC / entropix-smollm
smolLM with Entropix sampler on pytorch
☆149Updated last year
JoeLi12345 / nGPT
an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆108Updated 9 months ago
minosvasilias / simple_grpo
Simple GRPO scripts and configurations.
☆59Updated 10 months ago
QuixiAI / grokadamw
☆136Updated last year
xjdr-alt / muzero_sketch
☆40Updated last year
bloc97 / DeMo
DeMo: Decoupled Momentum Optimization
☆197Updated last year
doomslide / hyperobject
Plotting (entropy, varentropy) for small LMs
☆99Updated 6 months ago
jerber / lang-jepa
☆129Updated 11 months ago
facebookresearch / llm-speedrunner
The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag…
☆112Updated last month
SakanaAI / evo-memory
Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers.
☆330Updated last year
microsoft / GRIN-MoE
GRadient-INformed MoE
☆264Updated last year
epfml / DenseFormer
☆82Updated last year
flawedmatrix / mamba-ssm
Implementation of mamba with rust
☆88Updated last year
nivibilla / build-nanogpt
Video+code lecture on building nanoGPT from scratch
☆68Updated last year
JD-P / minihf
MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user…
☆182Updated last month
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆107Updated 9 months ago
LAION-AI / AIW
Alice in Wonderland code base for experiments and raw experiments data
☆131Updated 2 months ago
joey00072 / ohara
Collection of autoregressive model implementation
☆85Updated 7 months ago
main-horse / hnet-old
H-Net Dynamic Hierarchical Architecture
☆80Updated 2 months ago
HazyResearch / cartridges
Storing long contexts in tiny caches with self-study
☆218Updated last month
SonicCodes / lucid-v1
realtime latent world model inference demo
☆48Updated last year