hemingkx / SWIFT
[ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration
☆61 Updated 10 months ago
Alternatives and similar repositories for SWIFT
Users who are interested in SWIFT are comparing it to the repositories listed below.
- ☆49 Updated last year
- ☆126 Updated 6 months ago
- PoC for "SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning" [NeurIPS '25] ☆60 Updated 2 months ago
- [EMNLP 2025] TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆197 Updated 3 weeks ago
- ☆114 Updated 3 months ago
- ☆140 Updated 3 months ago
- Multi-Candidate Speculative Decoding