smart-lty / ParallelSpeculativeDecoding
[ICLR 2025] PEARL: parallel speculative decoding with adaptive draft length
☆37 Updated this week
Alternatives and similar repositories for ParallelSpeculativeDecoding:
Users who are interested in ParallelSpeculativeDecoding are comparing it to the libraries listed below.
- [NeurIPS 2024] The official implementation of "Kangaroo: Lossless Self-Speculative Decoding for Accelerating LLMs via Double Early Exitin… ☆48 Updated 7 months ago
- Multi-Candidate Speculative Decoding ☆34 Updated 9 months ago
- ☆36 Updated 2 months ago
- The Official Implementation of Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference ☆58 Updated 3 weeks ago
- ☆62 Updated 2 months ago
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main)