Infini-AI-Lab / Multiverse

☆79

Alternatives and similar repositories for Multiverse

Users who are interested in Multiverse are comparing it to the repositories listed below.

sustcsonglin / linear-attention-and-beyond-slides
☆78 Updated 5 months ago
efficientscaling / Z1
Repo for "Z1: Efficient Test-time Scaling with Code"
☆63 Updated 3 months ago
sail-sg / scaling-with-vocab
[NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623
☆86 Updated 10 months ago
xichen-fy / Fira
Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint?
☆112 Updated 9 months ago
SalesforceAIResearch / GemFilter
☆82 Updated 6 months ago
UNITES-Lab / MC-SMoE
[ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy"
☆88 Updated last month
sail-sg / Attention-Sink
[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)
☆103 Updated 3 weeks ago
OpenSparseLLMs / Linearization
☆52 Updated 3 weeks ago
MiroMindAsia / MiroMind-M1
☆84 Updated last week
Lucky-Lance / Expert_Sparsity
[ACL 2024] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models
☆95 Updated last year
haonan3 / AnchorContext
AnchorAttention: Improved attention for LLMs long-context training
☆212 Updated 6 months ago
OpenSparseLLMs / MoM
☆95 Updated 3 months ago
OpenSparseLLMs / Linear-MoE
☆112 Updated last month
sail-sg / LongSpec
LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification
☆61 Updated 2 weeks ago
yunfeixie233 / ViGaL
☆50 Updated last month
Parallel-Reasoning / APR
[COLM 2025] Code for Paper: Learning Adaptive Parallel Reasoning with Language Models
☆116 Updated 3 months ago
Infini-AI-Lab / gsm_infinite
☆51 Updated last month
tilde-research / nsa-impl
An efficient implementation of the NSA (Native Sparse Attention) kernel
☆108 Updated last month
imagination-research / lbt
[NeurIPS 2024] Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study
☆52 Updated 8 months ago
thu-ml / ReMoE
[ICLR2025] Codebase for "ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing", built on Megatron-LM.
☆85 Updated 7 months ago
PKU-ML / LongPPL
Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?"
☆92 Updated last week
ryoungj / BoLT
Code for "Reasoning to Learn from Latent Thoughts"
☆114 Updated 4 months ago
GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆159 Updated last week
Infini-AI-Lab / S2FT
☆19 Updated 7 months ago
thu-nics / R2R
The official code implementation for paper "R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing"
☆43 Updated this week
ltzheng / SimpleTIR
End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
☆158 Updated this week
Infini-AI-Lab / GRESO
☆45 Updated last month
inclusionAI / Ring
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI, derived from Ling.
☆88 Updated last month
LLM360 / Reasoning360
A repo for open research on building large reasoning models
☆84 Updated this week
DAMO-NLP-SG / LongPO
[ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
☆38 Updated 5 months ago