shaochenze / calm
Official implementation of "Continuous Autoregressive Language Models"
☆79 · Updated this week
Alternatives and similar repositories for calm
Users interested in calm are comparing it to the repositories listed below.
- ☆281 · Updated 2 weeks ago
- Esoteric Language Models ☆104 · Updated last month
- PyTorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at DeepMind ☆129 · Updated last week
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024) ☆162 · Updated 6 months ago
- Official repo of the paper LM2 ☆46 · Updated 8 months ago
- [EMNLP 2025] The official implementation for the paper "Agentic-R1: Distilled Dual-Strategy Reasoning" ☆101 · Updated 2 months ago
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models ☆222 · Updated this week
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best… ☆53 · Updated 7 months ago
- This is the official repository for Inheritune. ☆115 · Updated 8 months ago
- The open-source code of MetaStone-S1. ☆107 · Updated 3 months ago
- Repository for "TESS-2: A Large-Scale, Generalist Diffusion Language Model" ☆51 · Updated 8 months ago
- Verifiers for LLM Reinforcement Learning ☆78 · Updated 6 months ago
- A repository for research on medium-sized language models. ☆78 · Updated last year
- Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible… ☆84 · Updated 3 weeks ago
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya ☆117 · Updated 3 months ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆35 · Updated 8 months ago
- Tiny re-implementation of MDM in the style of LLaDA and the nano-gpt speedrun ☆57 · Updated 7 months ago
- ☆86 · Updated last year
- [ACL 2025] How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training ☆45 · Updated 3 months ago
- GPU-optimized framework for training diffusion language models at any scale. The backend of Quokka, Super Data Learners, and OpenMoE 2 tr… ☆89 · Updated this week
- EvaByte: Efficient Byte-level Language Models at Scale ☆110 · Updated 6 months ago
- Code for ExploreTom ☆86 · Updated 4 months ago
- ☆86 · Updated 2 weeks ago
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,… ☆51 · Updated last week
- ☆50 · Updated last year
- ☆19 · Updated 8 months ago
- PyTorch implementation of models from the Zamba2 series. ☆185 · Updated 9 months ago
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models (https://arxiv.org/pdf/2411.02433) ☆108 · Updated 11 months ago
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆28 · Updated 6 months ago
- The official repository of "R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Integration" ☆120 · Updated 2 months ago