lfsszd/CS-Drafting

Cascade Speculative Drafting

☆33

Alternatives and similar repositories for CS-Drafting

Users that are interested in CS-Drafting are comparing it to the libraries listed below.

NJUNLP / MCSD
Multi-Candidate Speculative Decoding
☆41 · Apr 22, 2024 · Updated 2 years ago
linfeng93 / BiTA
An innovative method expediting LLMs via streamlined semi-autoregressive generation and draft verification.
☆28 · Apr 15, 2025 · Updated last year
FasterDecoding / REST
REST: Retrieval-Based Speculative Decoding, NAACL 2024
☆220 · Mar 5, 2026 · Updated 4 months ago
salesforce / simplification
☆23 · Jun 25, 2026 · Updated 3 weeks ago
saint0x / tl
timelapse is the git primitive for agents: continuous, lossless checkpoint streams that capture every working state.
☆17 · Apr 3, 2026 · Updated 3 months ago
thunlp / Ouroboros
Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main)
☆118 · Mar 20, 2025 · Updated last year
lucidrains / speculative-decoding
Explorations into some recent techniques surrounding speculative decoding
☆307 · Dec 22, 2024 · Updated last year
kssteven418 / BigLittleDecoder
[NeurIPS'23] Speculative Decoding with Big Little Decoder
☆99 · Feb 6, 2024 · Updated 2 years ago
oujieww / ANPD
☆11 · Feb 5, 2026 · Updated 5 months ago
Frostlinx / Socratic-Zero
Socratic-Zero is a fully autonomous framework that generates high-quality training data for mathematical reasoning
☆37 · Oct 26, 2025 · Updated 8 months ago
dilab-zju / self-speculative-decoding
Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding**
☆230 · Feb 13, 2025 · Updated last year
BenGardiner123 / langchainjs-chat-with-your-github
This is the starter code for an example of storing a github repo in a vector store and chatting with it as a knowledge base
☆16 · Jun 22, 2023 · Updated 3 years ago
VITA-Group / LoCoCo
[ICML‘2024] "LoCoCo: Dropping In Convolutions for Long Context Compression", Ruisi Cai, Yuandong Tian, Zhangyang Wang, Beidi Chen
☆17 · Sep 7, 2024 · Updated last year
Infini-AI-Lab / MagicDec
[ICLR2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding
☆154 · Dec 4, 2024 · Updated last year
thunlp / NOSA
The official implementation of NOSA
☆19 · Jun 11, 2026 · Updated last month
jbilcke-hf / atryon
[WIP] AI Try-On plugin for Chrome
☆28 · Mar 16, 2024 · Updated 2 years ago
hemingkx / SpecDec
Codes for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)
☆47 · Dec 9, 2023 · Updated 2 years ago
feifeibear / LLMSpeculativeSampling
Fast inference from large language models via speculative decoding
☆920 · Aug 22, 2024 · Updated last year
Zcchill / Value-Residual-Learning
☆15 · Mar 20, 2025 · Updated last year
zaydzuhri / flame
Fork of Flame repo for training of some new stuff in development
☆20 · Updated this week
hemingkx / Spec-Bench
Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)
☆401 · Apr 22, 2025 · Updated last year
raymin0223 / fast_robust_early_exit
Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long)
☆67 · Sep 28, 2024 · Updated last year
chentong0 / rl-binary-rar
Official repo for "Binary Retrieval-augmented Reward Mitigates Hallucinations"
☆15 · Nov 13, 2025 · Updated 8 months ago
UbiquitousLearning / SLM_Survey
☆109 · Oct 2, 2024 · Updated last year
facebookresearch / LayerSkip
Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024
☆371 · Updated this week
MachineLearningSystem / 25ASPLOS-Medusa
Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]
☆12 · Nov 8, 2024 · Updated last year
fasa-org / dash-attention
DashAttention: Differentiable and Adaptive Sparse Hierarchical Attention
☆21 · May 25, 2026 · Updated last month
ise-uiuc / uniapr
Fast and Precise On-the-fly Patch Validation for All
☆10 · Feb 24, 2023 · Updated 3 years ago
Infini-AI-Lab / Sequoia
scalable and robust tree-based speculative decoding algorithm
☆376 · Jan 28, 2025 · Updated last year
thunlp / FR-Spec
[ACL 2025 main] FR-Spec: Frequency-Ranked Speculative Sampling
☆55 · Jul 15, 2025 · Updated last year
VITA-Group / Ms-PoE
"Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiw…
☆35 · May 7, 2024 · Updated 2 years ago
Hanpx20 / SafeSwitch
Official code repository for the paper "Internal Activation as the Polar Star for Steering Unsafe LLM Behavior"
☆15 · May 31, 2026 · Updated last month
CGCL-codes / HistFuzz
A practical fuzzing tool for SMT solvers
☆11 · Nov 26, 2025 · Updated 7 months ago
sail-sg / LongSpec
[ACL 2026 (Main)] LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification
☆84 · Jul 14, 2025 · Updated last year
wudu98 / autoGEMM
☆15 · Dec 5, 2024 · Updated last year
RTkenny / RiskPO
Official implementation of 'RiskPO: Risk-based Policy Optimization via Verifiable Reward for LLM Post-Training', accepted by ICLR 2026
☆18 · Oct 15, 2025 · Updated 9 months ago
nimish15shah / DAG_Processor
A DAG processor and compiler for a tree-based spatial datapath.
☆16 · Aug 24, 2022 · Updated 3 years ago
KaiNylund / lm-weights-encode-time
☆68 · Aug 16, 2024 · Updated last year
SafeAILab / EAGLE
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).
☆2,471 · Feb 20, 2026 · Updated 5 months ago