m-a-n-i-f-e-s-t/retention

Language modeling with linear-cost context

☆119

Alternatives and similar repositories for retention

Users interested in retention are comparing it to the libraries listed below.

m-a-n-i-f-e-s-t / power-attention
Attention Kernels for Symmetric Power Transformers
☆130 · Sep 25, 2025 · Updated 9 months ago
bcml-labs / rosa-plus
ROSA+: RWKV's ROSA implementation with fallback statistical predictor
☆36 · Oct 13, 2025 · Updated 9 months ago
LaunchPlatform / marketplace
Marketplace ML experiment - training without backprop
☆28 · Sep 9, 2025 · Updated 10 months ago
RWKV / ZeroCoT
https://x.com/BlinkDL_AI/status/1884768989743882276
☆28 · May 4, 2025 · Updated last year
thepowerfuldeez / sample_efficient_gpt
Training framework with a goal to explore the frontier of sample efficiency of small language models
☆101 · Jan 25, 2026 · Updated 5 months ago
tobiaskatsch / GatedLinearRNN
☆30 · Feb 27, 2024 · Updated 2 years ago
LeC-Z / RWKV-nonogram
A 20M RWKV v6 can do nonogram
☆13 · Oct 18, 2024 · Updated last year
kyleliang919 / Super_Muon
☆68 · Mar 21, 2025 · Updated last year
Noumena-Network / nmoe
MoE training for Me and You and maybe other people
☆394 · Mar 15, 2026 · Updated 4 months ago
ZihaoHuang-notabot / Ultra-Sparse-Memory-Network
☆48 · Jul 3, 2026 · Updated 2 weeks ago
hallerite / ludic
Ludic – an LLM-RL library for the era of experience
☆67 · Jan 9, 2026 · Updated 6 months ago
boweiliu / nccl
Optimized primitives for collective multi-GPU communication
☆11 · May 8, 2024 · Updated 2 years ago
MrYxJ / InfiniRetri
☆52 · Feb 17, 2025 · Updated last year
fla-org / flame
🔥 A minimal training framework for scaling FLA models
☆403 · Apr 22, 2026 · Updated 2 months ago
goombalab / hydra
Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers"
☆174 · Jan 30, 2025 · Updated last year
nyonicai / nyonic-public
Reference implementation of models from Nyonic Model Factory
☆12 · May 13, 2024 · Updated 2 years ago
sunblaze-ucb / rl-grok-recipe
Code repository for "RL Grokking Recipe: How RL Unlocks and Transfers New Algorithms in LLMs"
☆35 · Oct 12, 2025 · Updated 9 months ago
mkurman / neuroblast-v3
NeuroBLAST v3 architecture code
☆37 · Jan 6, 2026 · Updated 6 months ago
tilde-research / one-layer-deeper
☆38 · Updated this week
JinjieNi / dlms-are-super-data-learners
The official github repo for "Diffusion Language Models are Super Data Learners".
☆227 · Nov 6, 2025 · Updated 8 months ago
recursal / minmodmon
Mini Model Daemon
☆13 · Nov 9, 2024 · Updated last year
sdan / nanoEBM
minimal Energy-based transformer
☆44 · Dec 11, 2025 · Updated 7 months ago
mcleish7 / retrofitting-recurrence
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
☆68 · Nov 11, 2025 · Updated 8 months ago
Ziems / arbor
A framework for optimizing DSPy programs with RL
☆340 · Jan 12, 2026 · Updated 6 months ago
axolotl-ai-cloud / axolotl-cookbook
☆39 · Aug 1, 2025 · Updated 11 months ago
Infini-AI-Lab / Sparrow
☆16 · Jun 15, 2026 · Updated last month
NousResearch / nomos
☆195 · Dec 18, 2025 · Updated 7 months ago
tengxiao1 / MR-Search
Meta-Reinforcement Learning with Self-Reflection
☆33 · Mar 26, 2026 · Updated 3 months ago
RobertCsordas / moeut
☆93 · Aug 18, 2024 · Updated last year
explosion / curated-tokenizers
Lightweight piece tokenization library
☆12 · Apr 15, 2024 · Updated 2 years ago
yynil / RWKVInside
☆41 · Apr 30, 2025 · Updated last year
naver-ai / KoBBQ
Official code and dataset repository of KoBBQ (TACL 2024)
☆19 · May 13, 2024 · Updated 2 years ago
Chengsong-Huang / G-Zero
☆25 · May 14, 2026 · Updated 2 months ago
rovle / gpt3-in-context-fitting
Experiments on GPT-3's ability to fit numerical models in-context.
☆14 · Aug 11, 2022 · Updated 3 years ago
lindermanlab / elk
Scalable and Stable Parallelization of Nonlinear RNNs
☆33 · Jun 28, 2026 · Updated 3 weeks ago
UCD4IDS / MultiscaleGraphSignalTransforms.jl
MultiscaleGraphSignalTransforms.jl is a collection of software tools written in the Julia programming language for graph signal processin…
☆12 · Mar 15, 2026 · Updated 4 months ago
dnakov / hrm-mlx
MLX implementation of Hierarchical Reasoning Model (HRM) - Adaptive computation for complex reasoning tasks
☆29 · Aug 27, 2025 · Updated 10 months ago
qlabs-eng / slowrun
100M tokens. Infinite compute. Lowest val loss wins.
☆514 · Jul 3, 2026 · Updated 2 weeks ago
0xD4rky / nanotok
☆27 · Jun 7, 2026 · Updated last month