MoonshotAI / Kimi-Linear
☆1,184 Updated this week
Alternatives and similar repositories for Kimi-Linear
Users who are interested in Kimi-Linear are comparing it to the repositories listed below.
- ☆1,009 Updated this week
- Checkpoint-engine is a simple middleware to update model weights in LLM inference engines ☆829 Updated this week
- ☆838 Updated 2 months ago
- Parallel Scaling Law for Language Model: Beyond Parameter and Inference Time Scaling ☆451 Updated 6 months ago
- Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004) ☆670 Updated 3 weeks ago
- ☆702 Updated last month
- GPU-optimized framework for training diffusion language models at any scale. The backend of Quokka, Super Data Learners, and OpenMoE 2 tr… ☆272 Updated last week
- Official implementation of "Continuous Autoregressive Language Models" ☆584 Updated last week
- Dream 7B, a large diffusion language model ☆1,081 Updated last month
- Speed Always Wins: A Survey on Efficient Architectures for Large Language Models ☆359 Updated last week
- QeRL enables RL for 32B LLMs on a single H100 GPU. ☆441 Updated last month
- Scaling RL on advanced reasoning models ☆632 Updated last month
- Official Repository for "Glyph: Scaling Context Windows via Visual-Text Compression" ☆488 Updated 2 weeks ago
- Muon is Scalable for LLM Training ☆1,359 Updated 3 months ago
- A Scientific Multimodal Foundation Model ☆607 Updated last month
- dLLM: Simple Diffusion Language Modeling ☆950 Updated this week
- The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning" ☆233 Updated last week
- Simple & Scalable Pretraining for Neural Architecture Research ☆300 Updated 3 weeks ago
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation (NeurIPS 2025) ☆508 Updated last month
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models ☆223 Updated 2 weeks ago
- ☆907 Updated 2 weeks ago
- dInfer: An Efficient Inference Framework for Diffusion Language Models ☆315 Updated this week
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache ☆130 Updated 3 months ago
- Tina: Tiny Reasoning Models via LoRA ☆305 Updated 2 months ago
- Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation. ☆717 Updated last month
- DFloat11: Lossless LLM Compression for Efficient GPU Inference ☆560 Updated 2 months ago
- Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding" ☆676 Updated last month
- DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation ☆763 Updated 4 months ago
- ☆817 Updated 5 months ago
- Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs ☆194 Updated last month