zankner/Hydra

zankner / Hydra

☆55

Alternatives and similar repositories for Hydra

Users who are interested in Hydra are comparing it to the libraries listed below.

Equationliu / Kangaroo
[NeurIPS 2024] The official implementation of "Kangaroo: Lossless Self-Speculative Decoding for Accelerating LLMs via Double Early Exitin…
☆72 · Jun 26, 2024 · Updated 2 years ago
HArmonizedSS / HASS
Official Implementation of "Learning Harmonized Representations for Speculative Sampling" (HASS)
☆56 · Mar 14, 2025 · Updated last year
dilab-zju / self-speculative-decoding
Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding**
☆230 · Feb 13, 2025 · Updated last year
Luowaterbi / TokenRecycling
[ACL2025 Oral🔥] Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling
☆29 · Nov 11, 2025 · Updated 8 months ago
BradMcDanel / sdgp
☆10 · Feb 1, 2022 · Updated 4 years ago
FasterDecoding / Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
☆2,758 · Jun 25, 2024 · Updated 2 years ago
GATECH-EIC / Linearized-LLM
[ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
☆35 · Jun 12, 2024 · Updated 2 years ago
VITA-Group / Q-Hitter
☆15 · Jun 4, 2024 · Updated 2 years ago
sail-sg / LongSpec
[ACL 2026 (Main)] LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification
☆84 · Jul 14, 2025 · Updated last year
raymin0223 / fast_robust_early_exit
Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long)
☆67 · Sep 28, 2024 · Updated last year
HanNight / soft_self_consistency
Code for ACL 2024 paper "Soft Self-Consistency Improves Language Model Agents"
☆25 · Sep 11, 2024 · Updated last year
FMInference / DejaVu
☆359 · Apr 2, 2024 · Updated 2 years ago
hemingkx / SpeculativeDecodingPapers
📰 Must-read papers and blogs on Speculative Decoding ⚡️
☆1,281 · Jun 27, 2026 · Updated 3 weeks ago
VITA-Group / Robust_Weight_Signatures
[ICML 2023] "Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?" by Ruisi Cai, Zhenyu Zhang, Zhangyang Wang
☆16 · May 4, 2023 · Updated 3 years ago
sbi-benchmark / diffeqtorch
DifferentialEquations.jl with PyTorch
☆11 · Oct 12, 2022 · Updated 3 years ago
rayleizhu / vllm-ra
[ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts
☆39 · Feb 29, 2024 · Updated 2 years ago
bytedance / AffineQuant
Official implementation of the ICLR 2024 paper AffineQuant
☆30 · Mar 30, 2024 · Updated 2 years ago
yc2367 / BBS-MICRO
☆19 · Nov 11, 2024 · Updated last year
Infini-AI-Lab / TriForce
[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
☆281 · Aug 31, 2024 · Updated last year
bparli / bpfs
Rust In-Memory Filesystem
☆18 · Nov 28, 2019 · Updated 6 years ago
apple / ml-reversal-blessing
☆17 · Jul 31, 2025 · Updated 11 months ago
zhliu0106 / learning-to-refuse
Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs"
☆10 · Dec 13, 2024 · Updated last year
ScalingIntelligence / CATS
☆33 · Nov 11, 2024 · Updated last year
hao-ai-lab / LookaheadDecoding
[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
☆1,340 · Mar 6, 2025 · Updated last year
hyx1999 / SAM-Decoding
Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton
☆52 · May 12, 2026 · Updated 2 months ago
linfeng93 / BiTA
An innovative method expediting LLMs via streamlined semi-autoregressive generation and draft verification.
☆29 · Apr 15, 2025 · Updated last year
guoshikeji / taxi_ui_design
Open source taxi dispatch software: UI design mockups for a travel and ride-hailing app
☆14 · Dec 22, 2020 · Updated 5 years ago
sanxing-chen / HittER
Codebase for the EMNLP 2021 paper "HittER: Hierarchical Transformers for Knowledge Graph Embeddings".
☆12 · Nov 1, 2021 · Updated 4 years ago
lucidrains / speculative-decoding
Explorations into some recent techniques surrounding speculative decoding
☆307 · Dec 22, 2024 · Updated last year
nanfangAlan / FSRFER
A TensorFlow implementation of the paper "Feature Super-Resolution Based Facial Expression Recognition for Multi-scale Low-Resolution Ima…
☆13 · Nov 30, 2021 · Updated 4 years ago
jonnypei / acl23-preadd
☆12 · Jul 25, 2023 · Updated 3 years ago
VanessB / mutinfo
Mutual information estimators and benchmarks
☆14 · Updated this week
fabio-deep / Distributed-Pytorch-Boilerplate
PyTorch code for managing distributed training experiments.
☆21 · Mar 22, 2020 · Updated 6 years ago
flashinfer-ai / cutlass-viz
☆65 · Apr 26, 2025 · Updated last year
dcmoyer / invariance-tutorial
A tutorial on learned non-adversarial invariance in neural networks
☆14 · Dec 8, 2019 · Updated 6 years ago
hdong920 / LESS
☆53 · May 13, 2024 · Updated 2 years ago
haizelabs / bijection-learning
☆29 · Oct 22, 2024 · Updated last year
kuleshov-group / MODULoRA-Experiment
Evaluation Code repository for the paper "ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers". (2023…
☆13 · Dec 5, 2023 · Updated 2 years ago
apple / ml-recurrent-drafter
☆226 · Jan 23, 2025 · Updated last year