hemingkx/SWIFT

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/hemingkx/SWIFT)

hemingkx / SWIFT

[ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration

☆70

Alternatives and similar repositories for SWIFT

Users that are interested in SWIFT are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

hemingkx / Whisper
View on GitHub
[ACL 2026] Enabling Efficient Reasoning in LLMs via Black-box Persuasive Prompting
☆22Jan 9, 2026Updated 6 months ago
hemingkx / CDec
View on GitHub
Codes for our paper "Enhancing Continual Relation Extraction via Classifier Decomposition" (Findings of ACL2023)
☆10Nov 29, 2023Updated 2 years ago
dilab-zju / self-speculative-decoding
View on GitHub
Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding**
☆230Feb 13, 2025Updated last year
hemingkx / SpecDec
View on GitHub
Codes for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)
☆47Dec 9, 2023Updated 2 years ago
hemingkx / Spec-Bench
View on GitHub
Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)
☆401Apr 22, 2025Updated last year
Proton VPN Special Offer - Get 70% off • Ad
Special partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
liyongqi67 / LTRGR
View on GitHub
☆21Aug 9, 2024Updated last year
hemingkx / TokenSkip
View on GitHub
[EMNLP 2025] TokenSkip: Controllable Chain-of-Thought Compression in LLMs
☆224Nov 30, 2025Updated 7 months ago
hemingkx / SpeculativeDecodingPapers
View on GitHub
📰 Must-read papers and blogs on Speculative Decoding ⚡️
☆1,276Jun 27, 2026Updated 3 weeks ago
zhzihao / WikiGenBench
View on GitHub
WIKIGENBENCH: Exploring Full-length Wikipedia Generation under Real-World Scenario (COLING 2025)
☆13Jan 5, 2025Updated last year
QwenLM / ConsisEval
View on GitHub
☆14Jul 5, 2024Updated 2 years ago
wangjs9 / Muffin
View on GitHub
Codes for Mitigating Unhelpfulness in Emotional Support Conversations with Multifaceted AI Feedback (ACL 2024 Findings)
☆17Jul 2, 2024Updated 2 years ago
DualityRL / multi-attempt
View on GitHub
☆19Mar 10, 2025Updated last year
kaishxu / DFMed
View on GitHub
Code and data for "Medical Dialogue Generation via Dual Flow Modeling" (ACL 2023 Findings)
☆14Nov 22, 2023Updated 2 years ago
wangjs9 / CARE-master
View on GitHub
PyTorch implementation of CARE
☆16Oct 6, 2023Updated 2 years ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
smart-lty / ParallelSpeculativeDecoding
View on GitHub
[ICLR 2025] PEARL: Parallel Speculative Decoding with Adaptive Draft Length
☆169Dec 23, 2025Updated 6 months ago
Zanette-Labs / SpeculativeRejection
View on GitHub
[NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection
☆56Oct 29, 2024Updated last year
junzhang-zj / LoRAM
View on GitHub
[ICLR 2025] Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models
☆75Mar 29, 2025Updated last year
thunlp / FR-Spec
View on GitHub
[ACL 2025 main] FR-Spec: Frequency-Ranked Speculative Sampling
☆55Jul 15, 2025Updated last year
YiCheng98 / IntegrativeDecoding
View on GitHub
Official Implementation for the paper "Integrative Decoding: Improving Factuality via Implicit Self-consistency"
☆33Apr 12, 2025Updated last year
shoaibahmed / llm_depth_pruning
View on GitHub
Official implementation of the paper: "A deeper look at depth pruning of LLMs"
☆15Jul 24, 2024Updated last year
WangHanLinHenry / SPA-RL-Agent
View on GitHub
Official code for paper "SPA-RL: Reinforcing LLM Agent via Stepwise Progress Attribution"
☆89Sep 13, 2025Updated 10 months ago
formll / resolving-scaling-law-discrepancies
View on GitHub
☆19Nov 4, 2025Updated 8 months ago
wangjs9 / Aligned-dPM
View on GitHub
PyTorch implementation of experiments in the paper Aligning Language Models with Human Preferences via a Bayesian Approach
☆32Nov 6, 2023Updated 2 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
hyx1999 / SAM-Decoding
View on GitHub
Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton
☆52May 12, 2026Updated 2 months ago
sail-sg / SimLayerKV
View on GitHub
The official implementation of paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction.
☆54Oct 18, 2024Updated last year
iwangjian / Color4Dial
View on GitHub
Dialogue Planning via Brownian Bridge Stochastic Process for Goal-directed Proactive Dialogue (ACL Findings 2023)
☆21Nov 10, 2025Updated 8 months ago
liyongqi67 / GCoQA
View on GitHub
☆18Jun 24, 2025Updated last year
YiCheng98 / Cooper
View on GitHub
This repository provides the data and the codes used in the AAAI'24 paper, COOPER: Coordinating Specialized Agents towards a Complex Dial…
☆28Mar 1, 2024Updated 2 years ago
ModalityDance / MRM
View on GitHub
[SIGIR 2026] "One Adapts to Any: Meta Reward Modeling for Personalized LLM Alignment"
☆15Apr 21, 2026Updated 3 months ago
Parallel-Reasoning / APR
View on GitHub
[COLM 2025] Code for Paper: Learning Adaptive Parallel Reasoning with Language Models
☆144Dec 17, 2025Updated 7 months ago
ZNLP / Language-Imbalance-Driven-Rewarding
View on GitHub
[ICLR 2025] Language Imbalance Driven Rewarding for Multilingual Self-improving
☆25Apr 6, 2026Updated 3 months ago
FFY0 / AdaKV
View on GitHub
The Official Implementation of Ada-KV [NeurIPS 2025]
☆139Nov 26, 2025Updated 7 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
wjhou / Radar
View on GitHub
[ACL 2025] RADAR: Enhancing Radiology Report Generation with Supplementary Knowledge Injection
☆34Jul 23, 2025Updated 11 months ago
chenllliang / MMEvalPro
View on GitHub
[NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs
☆25Sep 26, 2024Updated last year
ModalityDance / Awesome-Agent-as-a-Judge
View on GitHub
"A Survey on Agent-as-a-Judge"
☆138May 11, 2026Updated 2 months ago
sail-sg / LongSpec
View on GitHub
[ACL 2026 (Main)] LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification
☆84Jul 14, 2025Updated last year
iwangjian / Midi-Tuning
View on GitHub
[ACL 2024] Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue
☆26Oct 18, 2025Updated 9 months ago
iwangjian / Coding-Tutor
View on GitHub
[ACL 2025 Findings] Training Turn-by-Turn Verifiers for Dialogue Tutoring Agents: The Curious Case of LLMs as Your Coding Tutors
☆90Jun 2, 2025Updated last year
EIT-NLP / Distilling-CoT-Reasoning
View on GitHub
[ACL 2025 Findings] Official implementation of the paper "Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning".
☆22Feb 26, 2025Updated last year