inclusionAI / Ring
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI, derived from Ling.
☆41 · Updated last month
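For readers who want to try the model, the sketch below shows one plausible way to load and query a Ring checkpoint with the Hugging Face `transformers` library. The checkpoint name `inclusionAI/Ring-lite`, the chat-template call, and the generation settings are assumptions for illustration, not taken from the Ring README; check the repository for the actual checkpoints and recommended usage.

```python
# Minimal sketch (not from the Ring repo): loading and querying an MoE chat
# model through the standard transformers causal-LM interface.
# The model ID below is an assumption; substitute the checkpoint the README lists.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "inclusionAI/Ring-lite"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # let the checkpoint choose bf16/fp16
    device_map="auto",    # spread the MoE weights across available GPUs
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Prove that the sum of two even numbers is even."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```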
Alternatives and similar repositories for Ring
Users interested in Ring are comparing it to the libraries listed below.
- ☆95 · Updated 2 weeks ago
- Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free" ☆73 · Updated 7 months ago
- ☆100 · Updated last week
- A Sober Look at Language Model Reasoning ☆63 · Updated last week
- One-shot Entropy Minimization ☆119 · Updated this week
- [ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen… ☆69 · Updated last week
- Repository for "What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models" ☆42 · Updated this week
- The official implementation of the paper "SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction" ☆46 · Updated 7 months ago
- ☆83 · Updated last month
- ☆22 · Updated 10 months ago
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning ☆103 · Updated this week
- [ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy" ☆84 · Updated last year
- LongSpec: Long-Context Speculative Decoding with Efficient Drafting and Verification ☆54 · Updated 3 months ago
- [ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models ☆97 · Updated 4 months ago
- ☆19 · Updated 5 months ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆46 · Updated last week
- ☆47 · Updated 2 months ago
- ☆16 · Updated 3 weeks ago
- Official implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent" ☆66 · Updated 6 months ago
- Squeezed Attention: Accelerating Long Prompt LLM Inference ☆46 · Updated 6 months ago
- LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning ☆31 · Updated last year
- qwen-nsa ☆66 · Updated last month
- ☆9 · Updated 9 months ago
- Code repo for "CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs" ☆14 · Updated 8 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆69 · Updated 3 months ago
- [ICML 2024 Oral] This project is the official implementation of our Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti… ☆65 · Updated last year
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆98 · Updated last month
- [ICML 2025] "SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator" ☆73 · Updated 5 months ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning): Diving into Self-Evolving Training for Multimodal Reasoning ☆60 · Updated 5 months ago
- Trinity-RFT is a general-purpose, flexible, and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (… ☆110 · Updated this week