Luowaterbi / TokenRecycling
[ACL 2025 Oral🔥] Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling
☆21 · Updated 2 months ago
Alternatives and similar repositories for TokenRecycling
Users interested in TokenRecycling are comparing it to the repositories listed below.
- ☆49 · Updated last year
- Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding** · ☆214 · Updated 10 months ago
- A Comprehensive Survey on Long Context Language Modeling · ☆216 · Updated last month
- ☆299 · Updated 6 months ago
- Official Implementation of "Learning Harmonized Representations for Speculative Sampling" (HASS) · ☆52 · Updated 9 months ago
- ☆71 · Updated 8 months ago
- [ICLR 2025🔥] D2O: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models · ☆25 · Updated 6 months ago
- Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings) · ☆350 · Updated 8 months ago
- [ICLR 2025] PEARL: Parallel Speculative Decoding with Adaptive Draft Length · ☆144 · Updated 2 weeks ago
- [EMNLP 2025] TokenSkip: Controllable Chain-of-Thought Compression in LLMs · ☆197 · Updated last month
- ☆29 · Updated 3 months ago
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" · ☆243 · Updated 3 months ago
- Evaluation utilities based on SymPy. · ☆21 · Updated last year
- REST: Retrieval-Based Speculative Decoding, NAACL 2024 · ☆213 · Updated 4 months ago
- Repository of LV-Eval Benchmark · ☆73 · Updated last year
- ☆126 · Updated 7 months ago
- [ACL 2024] MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues · ☆136 · Updated last year
- Codes for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718 · ☆368 · Updated last year
- SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs · ☆182 · Updated 3 months ago
- Bridge Megatron-Core to Hugging Face/Reinforcement Learning · ☆181 · Updated this week
- [ACL 2025 main] FR-Spec: Frequency-Ranked Speculative Sampling · ☆49 · Updated 5 months ago
- A survey of long-context LLMs from four perspectives: architecture, infrastructure, training, and evaluation · ☆61 · Updated 9 months ago
- Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process! · ☆71 · Updated 9 months ago
- Reproducing R1 for Code with Reliable Rewards · ☆278 · Updated 8 months ago
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains… · ☆255 · Updated 4 months ago
- Model merging is a highly efficient approach for long-to-short reasoning. · ☆96 · Updated 2 months ago
- ☆41 · Updated 9 months ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ · ☆270 · Updated last year
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection · ☆54 · Updated last year
- PoC for "SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning" [NeurIPS '25] · ☆61 · Updated 3 months ago