Kamichanw / CoS
[ICML'25] Official code of paper "Fast Large Language Model Collaborative Decoding via Speculation"
☆28 Updated 4 months ago
Alternatives and similar repositories for CoS
Users who are interested in CoS are comparing it to the repositories listed below.
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆87 Updated 9 months ago
- ☆30 Updated 2 months ago
- ☆20 Updated last week
- Model merging is a highly efficient approach for long-to-short reasoning. ☆89 Updated last month
- ☆45 Updated last month
- [AAAI 2026] Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning". ☆88 Updated 2 weeks ago
- A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architect… ☆127 Updated 3 weeks ago
- Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping ☆59 Updated 6 months ago
- The official implementation for [NeurIPS2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink… ☆105 Updated 2 months ago
- A Sober Look at Language Model Reasoning ☆87 Updated this week
- [NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model ☆56 Updated 3 weeks ago
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations ☆138 Updated last week
- The official implementation of "LightTransfer: Your Long-Context LLM is Secretly a Hybrid Model with Effortless Adaptation" ☆20 Updated 7 months ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models ☆63 Updated 11 months ago
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25] ☆94 Updated 7 months ago
- ☆46 Updated 7 months ago
- The official repository of the Omni-MATH benchmark. ☆88 Updated 11 months ago
- 🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training ☆88 Updated 11 months ago
- ☆112 Updated 5 months ago
- ☆106 Updated 2 months ago
- (ACL 2025 oral) SCOPE: Optimizing KV Cache Compression in Long-context Generation ☆33 Updated 5 months ago
- CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models (NeurIPS 2025) ☆167 Updated 2 weeks ago
- [ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen… ☆80 Updated 5 months ago
- ☆136 Updated 2 months ago
- ☆165 Updated last month
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆83 Updated 7 months ago
- ☆69 Updated 5 months ago
- SIFT: Grounding LLM Reasoning in Contexts via Stickers ☆58 Updated 8 months ago
- ☆20 Updated 11 months ago
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression ☆123 Updated 7 months ago