lucidrains / HRM
Exploration into the proposed architecture from Sapient Intelligence of Singapore 🇸🇬
★72 Updated 3 months ago
Alternatives and similar repositories for HRM
Users that are interested in HRM are comparing it to the libraries listed below
- ★105 Updated 4 months ago
- PyTorch implementation of models from the Zamba2 series. ★186 Updated 10 months ago
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs. ★41 Updated last year
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ★146 Updated 9 months ago
- look how they massacred my boy ★63 Updated last year
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ★173 Updated 10 months ago
- EvaByte: Efficient Byte-level Language Models at Scale ★111 Updated 7 months ago
- ★105 Updated 11 months ago
- RWKV-7: Surpassing GPT ★100 Updated last year
- smolLM with Entropix sampler on pytorch ★149 Updated last year
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ★108 Updated 9 months ago
- Simple GRPO scripts and configurations. ★59 Updated 10 months ago
- ★136 Updated last year
- ★40 Updated last year
- DeMo: Decoupled Momentum Optimization ★197 Updated last year
- Plotting (entropy, varentropy) for small LMs ★99 Updated 6 months ago
- ★129 Updated 11 months ago
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ★112 Updated last month
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ★330 Updated last year
- GRadient-INformed MoE ★264 Updated last year
- ★82 Updated last year
- Implementation of mamba with rust ★88 Updated last year
- Video+code lecture on building nanoGPT from scratch ★68 Updated last year
- MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user… ★182 Updated last month
- Train your own SOTA deductive reasoning model ★107 Updated 9 months ago
- Alice in Wonderland code base for experiments and raw experiments data ★131 Updated 2 months ago
- Collection of autoregressive model implementation ★85 Updated 7 months ago
- H-Net Dynamic Hierarchical Architecture ★80 Updated 2 months ago
- Storing long contexts in tiny caches with self-study ★218 Updated last month
- realtime latent world model inference demo ★48 Updated last year