[ICML 2025] Reward-guided Speculative Decoding (RSD) for efficiency and effectiveness.
☆56May 2, 2025Updated 10 months ago
Alternatives and similar repositories for RSD
Users who are interested in RSD are comparing it to the libraries listed below.
- ☆20May 14, 2025Updated 9 months ago
- Make reasoning models scalable☆47May 31, 2025Updated 9 months ago
- ☆32Oct 13, 2025Updated 4 months ago
- ThinK: Thinner Key Cache by Query-Driven Pruning☆27Feb 11, 2025Updated last year
- [arXiv 2025] "CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought"☆15Apr 3, 2025Updated 11 months ago
- PoC for "SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning" [NeurIPS '25]☆65Oct 2, 2025Updated 5 months ago
- Agent-RRM: Exploring Reasoning Reward Model for Agents☆49Updated this week
- SLiM: One-shot Quantized Sparse Plus Low-rank Approximation of LLMs (ICML 2025)☆34Nov 28, 2025Updated 3 months ago
- The official repository of NeurIPS'25 paper "Ada-R1: From Long-Cot to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization"☆22Nov 9, 2025Updated 4 months ago
- ☆21Updated this week
- VidKV: Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models☆26Mar 26, 2025Updated 11 months ago
- ☆35Jan 16, 2026Updated last month
- Code for "RSQ: Learning from Important Tokens Leads to Better Quantized LLMs"☆21Jun 11, 2025Updated 8 months ago
- RENT (Reinforcement Learning via Entropy Minimization) is an unsupervised method for training reasoning LLMs.☆41Oct 31, 2025Updated 4 months ago
- LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification☆74Jul 14, 2025Updated 7 months ago
- An adaptive sampling framework for Reinforce-style LLM post training.☆92Nov 29, 2025Updated 3 months ago
- [ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration☆64Feb 21, 2025Updated last year
- [COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding☆277Aug 31, 2024Updated last year
- [CVPR 2025] DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models☆102Nov 22, 2025Updated 3 months ago
- This repository is the official implementation of "Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE"☆37Oct 5, 2025Updated 5 months ago
- (ACL 2025 Main) Distilling RAG for SLMs from LLMs to Transfer Knowledge and Mitigate Hallucination via Evidence and Graph-based Distillat…☆34Aug 23, 2025Updated 6 months ago
- Introduction about AWESOME_ENTROPY+LRM_PAPERS☆30Dec 16, 2025Updated 2 months ago
- ☆50Aug 21, 2025Updated 6 months ago
- Benchmark and research code for the paper SWEET-RL Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆263May 5, 2025Updated 10 months ago
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression☆133Apr 12, 2025Updated 10 months ago
- General Use Timeseries Containers for Rust☆11Dec 31, 2020Updated 5 years ago
- ☆12Oct 4, 2021Updated 4 years ago
- Jupyter notebook templates for processing and analyzing neuroscience data.☆14Updated this week
- JMLR Cover Letter Template☆10Dec 15, 2021Updated 4 years ago
- Documentation at☆14Mar 27, 2025Updated 11 months ago
- [ICLR 2026] Learning to Parallel: Accelerating Diffusion Large Language Models via Learnable Parallel Decoding☆30Jan 27, 2026Updated last month
- Knowledge sharing of AWS (Amazon Web Services) Cloud☆12Jun 7, 2021Updated 4 years ago
- ☆13Feb 4, 2025Updated last year
- Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regressi…☆23Oct 1, 2025Updated 5 months ago
- Analytics tool that applies Natural Language Processing (NLP) and Machine Learning (ML), such as concept extraction, idea classification,…☆10Dec 7, 2022Updated 3 years ago
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity☆22Aug 28, 2025Updated 6 months ago
- A useful opinionated repository of Artificial Intelligence ecosystem resources - AI event calendars, papers, books, articles, videos, AI …☆19Mar 1, 2026Updated last week
- Official code repository for "CLARE: Continual Learning for Vision-Language-Action Models via Autonomous Adapter Routing and Expansion".☆27Jan 27, 2026Updated last month
- Integrating neurosymbolic representations into LLMs for interpretability, steering, and running symbolic algorithms☆14Feb 2, 2026Updated last month