Equationliu/Kangaroo

[NeurIPS 2024] The official implementation of "Kangaroo: Lossless Self-Speculative Decoding for Accelerating LLMs via Double Early Exiting"

☆72

Alternatives and similar repositories for Kangaroo

Users that are interested in Kangaroo are comparing it to the libraries listed below.

zankner / Hydra
☆55 · Feb 19, 2024 · Updated 2 years ago
hemingkx / Spec-Bench
Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)
☆401 · Apr 22, 2025 · Updated last year
Adaxry / Unified_Layer_Skipping
☆15 · Apr 11, 2024 · Updated 2 years ago
yc2367 / BBS-MICRO
☆19 · Nov 11, 2024 · Updated last year
dilab-zju / self-speculative-decoding
Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding**
☆230 · Feb 13, 2025 · Updated last year
GATECH-EIC / Linearized-LLM
[ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
☆35 · Jun 12, 2024 · Updated 2 years ago
ASISys / AdaSkip
AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference
☆21 · Jan 24, 2025 · Updated last year
oujieww / ANPD
☆11 · Feb 5, 2026 · Updated 5 months ago
hemingkx / SpeculativeDecodingPapers
📰 Must-read papers and blogs on Speculative Decoding ⚡️
☆1,278 · Jun 27, 2026 · Updated 3 weeks ago
Lyun0912-wu / LongAttn
LongAttn: Selecting Long-context Training Data via Token-level Attention
☆15 · Jul 16, 2025 · Updated last year
sail-sg / LongSpec
[ACL 2026 (Main)] LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification
☆84 · Jul 14, 2025 · Updated last year
Jingyu6 / speculative_prefill
☆63 · May 19, 2025 · Updated last year
smart-lty / ParallelSpeculativeDecoding
[ICLR 2025] PEARL: Parallel Speculative Decoding with Adaptive Draft Length
☆170 · Dec 23, 2025 · Updated 7 months ago
ningding-o / MeKi
Homepage for paper “MeKi : Memory-based Expert Knowledge Injection for Efficient LLM Scaling”
☆29 · Mar 5, 2026 · Updated 4 months ago
AutonomicPerfectionist / PipeInfer
PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation
☆32 · Nov 16, 2024 · Updated last year
dongwonjo / FastKV
[ACL Findings 2026] Official Implementation of "FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acc…
☆32 · Apr 14, 2026 · Updated 3 months ago
gudiandian / ElasticFlow
☆17 · May 10, 2024 · Updated 2 years ago
microsoft / chunk-attention
☆89 · Apr 18, 2025 · Updated last year
FasterDecoding / REST
REST: Retrieval-Based Speculative Decoding, NAACL 2024
☆220 · Mar 5, 2026 · Updated 4 months ago
FMInference / DejaVu
☆359 · Apr 2, 2024 · Updated 2 years ago
VITA-Group / Q-Hitter
☆15 · Jun 4, 2024 · Updated 2 years ago
SuDIS-ZJU / Efficient-LVLMs-Inference
[ACL 2026 Findings] Living repository for the survey paper “Efficient Inference for Large Vision-Language Models: Bottlenecks, Techniques…
☆26 · Apr 8, 2026 · Updated 3 months ago
VITA-Group / WeLore
[ICML 2025] From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories and Applications
☆52 · Oct 30, 2025 · Updated 8 months ago
yichen14 / FastAdaSP
Code for the paper "FastAdaSP: An Efficient Multitask Inference Framework for Large Speech Language Models". @ EMNLP'24(Oral)
☆17 · Nov 14, 2024 · Updated last year
LiuXiaoxuanPKU / OSD
☆68 · Dec 3, 2024 · Updated last year
ChandlerGuan / Transkimmer
Code for ACL2022 publication Transkimmer: Transformer Learns to Layer-wise Skim
☆22 · Aug 21, 2022 · Updated 3 years ago
linfeng93 / BiTA
An innovative method expediting LLMs via streamlined semi-autoregressive generation and draft verification.
☆28 · Apr 15, 2025 · Updated last year
yale-nlp / refdpo
☆16 · Jul 23, 2024 · Updated 2 years ago
ArmelRandy / tree-of-problems
[EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality
☆20 · Mar 4, 2025 · Updated last year
InternLM / Condor
[ACL 2025] An official pytorch implement of the paper: Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement
☆40 · May 28, 2025 · Updated last year
SafeAILab / EAGLE
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).
☆2,476 · Feb 20, 2026 · Updated 5 months ago
hgyhungry / alcop-artifact
☆25 · Mar 15, 2023 · Updated 3 years ago
haiduo / Jakiro
This repository is the official implementation of "Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE" [ACL 2026 Mai…
☆37 · Oct 5, 2025 · Updated 9 months ago
feifeibear / LLMSpeculativeSampling
Fast inference from large language models via speculative decoding
☆921 · Aug 22, 2024 · Updated last year
zhuohan123 / terapipe
☆79 · May 4, 2021 · Updated 5 years ago
Infini-AI-Lab / Sequoia
scalable and robust tree-based speculative decoding algorithm
☆376 · Jan 28, 2025 · Updated last year
hdong920 / GRIFFIN
☆40 · Aug 27, 2024 · Updated last year
HanGuo97 / lq-lora
☆129 · Jan 22, 2024 · Updated 2 years ago
HArmonizedSS / HASS
Official Implementation of "Learning Harmonized Representations for Speculative Sampling" (HASS)
☆56 · Mar 14, 2025 · Updated last year