Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main)
☆117 · Mar 20, 2025 · Updated last year
Alternatives and similar repositories for Ouroboros
Users that are interested in Ouroboros are comparing it to the libraries listed below.
- An innovative method expediting LLMs via streamlined semi-autoregressive generation and draft verification. ☆28 · Apr 15, 2025 · Updated 11 months ago
- REST: Retrieval-Based Speculative Decoding, NAACL 2024 ☆215 · Mar 5, 2026 · Updated 3 weeks ago
- ☆14 · Oct 3, 2024 · Updated last year
- [COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding ☆278 · Aug 31, 2024 · Updated last year
- Multi-Candidate Speculative Decoding ☆40 · Apr 22, 2024 · Updated last year
- [ICLR 2025] PEARL: Parallel Speculative Decoding with Adaptive Draft Length ☆151 · Dec 23, 2025 · Updated 3 months ago
- 📰 Must-read papers and blogs on Speculative Decoding ⚡️ ☆1,163 · Mar 9, 2026 · Updated 2 weeks ago
- ☆11 · Feb 5, 2026 · Updated last month
- Cascade Speculative Drafting ☆33 · Apr 2, 2024 · Updated last year
- scalable and robust tree-based speculative decoding algorithm ☆374 · Jan 28, 2025 · Updated last year
- [ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding ☆1,324 · Mar 6, 2025 · Updated last year
- Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings) ☆376 · Apr 22, 2025 · Updated 11 months ago
- Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding** ☆221 · Feb 13, 2025 · Updated last year
- Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton ☆45 · Feb 13, 2025 · Updated last year
- (ACL 2025 oral) SCOPE: Optimizing KV Cache Compression in Long-context Generation