Multiverse4FM / MultiverseLinks

☆81

Alternatives and similar repositories for Multiverse

Users that are interested in Multiverse are comparing it to the libraries listed below

Sorting:

Infini-AI-Lab / Multiverse
☆100Updated last month
OpenSparseLLMs / Linearization
☆61Updated 4 months ago
xichen-fy / Fira
Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint?
☆115Updated last year
BaohaoLiao / RSD
[ICML 2025] Reward-guided Speculative Decoding (RSD) for efficiency and effectiveness.
☆50Updated 6 months ago
horseee / dKV-Cache
[NeurIPS'25] dKV-Cache: The Cache for Diffusion Language Models
☆114Updated 5 months ago
SalesforceAIResearch / GemFilter
☆85Updated 9 months ago
OpenSparseLLMs / Linear-MoE
☆120Updated 5 months ago
Tencent / llm.hunyuan.T1
☆85Updated 7 months ago
MiniMax-AI / SynLogic
[NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond
☆176Updated 4 months ago
yaof20 / Flash-RL
Implementation for FP8/INT8 Rollout for RL training without performence drop.
☆264Updated last month
ruipeterpan / specreason
PoC for "SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning" [NeurIPS '25]
☆57Updated last month
hyx1999 / SAM-Decoding
Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton
☆34Updated 8 months ago
TemporaryLoRA / Block-Attention
☆40Updated 7 months ago
horseee / CoT-Valve
CoT-Valve: Length-Compressible Chain-of-Thought Tuning
☆86Updated 8 months ago
maomaocun / dLLM-cache
Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache…
☆174Updated 2 months ago
tilde-research / nsa-impl
An efficient implementation of the NSA (Native Sparse Attention) kernel
☆124Updated 4 months ago
inclusionAI / Ring-V2
Ring-V2 is a reasoning MoE LLM provided and open-sourced by InclusionAI.
☆72Updated 2 weeks ago
zhijie-group / SIFT
SIFT: Grounding LLM Reasoning in Contexts via Stickers
☆58Updated 8 months ago
pengzhangzhi / Open-dLLM
The most open diffusion language model for code generation — releasing pretraining, evaluation, inference, and checkpoints.
☆364Updated last month
OpenSparseLLMs / MoM
☆105Updated last month
QwenLM / ParScale
Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling
☆450Updated 5 months ago
thu-nics / MoA
[CoLM'25] The official implementation of the paper <MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression>
☆148Updated 3 months ago
haonan3 / AnchorContext
AnchorAttention: Improved attention for LLMs long-context training
☆213Updated 9 months ago
thu-ml / ReMoE
[ICLR2025] Codebase for "ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing", built on Megatron-LM.
☆97Updated 10 months ago
hao-ai-lab / Dynasor
[NeurIPS 2025] Simple extension on vLLM to help you speed up reasoning model without training.
☆201Updated 5 months ago
sail-sg / SimLayerKV
The official implementation of paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction.
☆50Updated last year
NVlabs / Fast-dLLM
Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"
☆620Updated 2 weeks ago
microsoft / SeerAttention
SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs
☆168Updated last month
efficientscaling / Z1
[EMNLP'2025 Industry] Repo for "Z1: Efficient Test-time Scaling with Code"
☆66Updated 6 months ago
qiuzh20 / gated_attention
The official implementation for [NeurIPS2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink…
☆101Updated last month