lightseekorg/TorchSpec

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/lightseekorg/TorchSpec)

lightseekorg / TorchSpec

A PyTorch native library for training speculative decoding models

☆203

Alternatives and similar repositories for TorchSpec

Users that are interested in TorchSpec are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

sgl-project / SpecForge
View on GitHub
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
☆1,000Updated this week
lightseekorg / tokenspeed
View on GitHub
TokenSpeed is a speed-of-light LLM inference engine.
☆1,643Updated this week
lightseekorg / smg
View on GitHub
Engine-agnostic LLM gateway in Rust. Full OpenAI & Anthropic API compatibility across vLLM, TRT-LLM, TokenSpeed, SGLang, OpenAI, Gemini &…
☆409Updated this week
jianuo-huang / Domino
View on GitHub
Official implementation of “Domino: Decoupling Causal Modeling from Autoregressive Drafting in Speculative Decoding”.
☆121Updated this week
togethercomputer / aurora
View on GitHub
☆72Apr 30, 2026Updated 2 months ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
vllm-project / speculators
View on GitHub
A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM
☆633Updated this week
sgl-project / genai-bench
View on GitHub
Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serv…
☆314Updated this week
smart-lty / nano-PEARL
View on GitHub
Draft-Target Disaggregation LLM Serving System via Parallel Speculative Decoding.
☆210Mar 18, 2026Updated 4 months ago
radixark / miles
View on GitHub
Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.
☆1,761Updated this week
Dogacel / Attention-Drift
View on GitHub
Code for the paper *Attention Drift: What Speculative Decoding Models Learn*.
☆27May 12, 2026Updated 2 months ago
BBuf / AI-Infra-Auto-Driven-SKILLS
View on GitHub
☆692Jul 14, 2026Updated last week
mit-han-lab / fastrl
View on GitHub
[ASPLOS'26] Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter
☆174Feb 27, 2026Updated 4 months ago
ome-projects / ome
View on GitHub
Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, T…
☆481Updated this week
uccl-project / mKernel
View on GitHub
mKernel: fast multi-node, multi-GPU fused kernels
☆252Jun 21, 2026Updated last month
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
inclusionAI / humming
View on GitHub
☆160Updated this week
iree-org / iree-tokenizer-py
View on GitHub
IREE Tokenizer Python Bindings
☆20Mar 5, 2026Updated 4 months ago
deepseek-ai / DeepSpec
View on GitHub
DeepSpec: a full-stack codebase for training and evaluating speculative decoding algorithms
☆6,719Jul 9, 2026Updated last week
electron-shaders / MineDraft
View on GitHub
☆38Jun 23, 2026Updated 3 weeks ago
perplexityai / pplx-kernels
View on GitHub
Perplexity GPU Kernels
☆591Nov 7, 2025Updated 8 months ago
flashinfer-ai / flashinfer
View on GitHub
FlashInfer: Kernel Library for LLM Serving
☆5,994Updated this week
hyx1999 / SAM-Decoding
View on GitHub
Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton
☆52May 12, 2026Updated 2 months ago
MiniMax-AI / MSA
View on GitHub
☆380Jun 15, 2026Updated last month
vllm-project / vime
View on GitHub
An LLM post-training framework with vLLM for RL Scaling
☆379Updated this week
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
liranringel / ddtree
View on GitHub
☆389Apr 16, 2026Updated 3 months ago
tile-ai / TileRT
View on GitHub
Tile-Based Runtime for Ultra-Low-Latency LLM Inference
☆1,579Jul 14, 2026Updated last week
uccl-project / rdmatop
View on GitHub
htop-like TUI for real-time RDMA network monitoring.
☆76Jul 12, 2026Updated last week
SafeAILab / EAGLE
View on GitHub
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).
☆2,473Feb 20, 2026Updated 5 months ago
kvcache-ai / TrEnv-X
View on GitHub
☆95Sep 15, 2025Updated 10 months ago
hemingkx / SpeculativeDecodingPapers
View on GitHub
📰 Must-read papers and blogs on Speculative Decoding ⚡️
☆1,277Jun 27, 2026Updated 3 weeks ago
NVIDIA / CompileIQ
View on GitHub
An Optimizer for Nvidia Compilers.
☆107Jul 3, 2026Updated 2 weeks ago
vllm-project / router
View on GitHub
A high-performance and light-weight router for vLLM large scale deployment
☆321Jul 13, 2026Updated last week
InternLM / turbomind
View on GitHub
☆96Mar 26, 2025Updated last year
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
Tencent / hpc-ops
View on GitHub
High Performance LLM Inference Operator Library
☆1,052Updated this week
THUDM / slime
View on GitHub
slime is an LLM post-training framework for RL Scaling.
☆7,569Updated this week
sgl-project / DeepGEMM
View on GitHub
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
☆32Updated this week
togethercomputer / xorl
View on GitHub
XoRL
☆17Updated this week
sgl-project / sglang-omni
View on GitHub
SGLang Omni: High-Performance Multi-Stage Pipeline Framework for Omni Models
☆656Updated this week
kvcache-ai / Mooncake
View on GitHub
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
☆5,941Updated this week
sgl-project / sgl-learning-materials
View on GitHub
Materials for learning SGLang
☆860Jan 5, 2026Updated 6 months ago