oujieww/ANPD

☆11

Alternatives and similar repositories for ANPD

Users who are interested in ANPD are comparing it to the repositories listed below.

uw-mad-dash / decoding-speculative-decoding
☆16 · Aug 19, 2024 · Updated last year
iLearn-Lab / ACL25-PTQ1.61
☆15 · Apr 6, 2026 · Updated 3 months ago
MurongYue / LLM_MoT_cascade
This is the implementation for the paper "LARGE LANGUAGE MODEL CASCADES WITH MIXTURE OF THOUGHT REPRESENTATIONS FOR COST-EFFICIENT REA…
☆32 · Jun 1, 2024 · Updated 2 years ago
zaydzuhri / flame
Fork of Flame repo for training of some new stuff in development
☆20 · Jul 15, 2026 · Updated last week
OpenGVLab / LLMPrune-BESA
BESA is a differentiable weight pruning technique for large language models.
☆17 · Mar 4, 2024 · Updated 2 years ago
thunlp / Ouroboros
Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main)
☆117 · Mar 20, 2025 · Updated last year
KaiLv69 / DuoDecoding
DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting
☆19 · Mar 4, 2025 · Updated last year
NJUNLP / MCSD
Multi-Candidate Speculative Decoding
☆41 · Apr 22, 2024 · Updated 2 years ago
Oneflow-Inc / oneflow-lite
☆17 · Jan 1, 2024 · Updated 2 years ago
raymin0223 / fast_robust_early_exit
Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long)
☆67 · Sep 28, 2024 · Updated last year
Jikai0Wang / OPT-Tree
☆30 · May 24, 2025 · Updated last year
guoshikeji / taxi_ui_design
Open-source taxi dispatch software: UI design mockups for the 出行加 ride-hailing app
☆14 · Dec 22, 2020 · Updated 5 years ago
Equationliu / Kangaroo
[NeurIPS 2024] The official implementation of "Kangaroo: Lossless Self-Speculative Decoding for Accelerating LLMs via Double Early Exitin…
☆72 · Jun 26, 2024 · Updated 2 years ago
dilab-zju / self-speculative-decoding
Code associated with the paper "Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding"
☆230 · Feb 13, 2025 · Updated last year
nanfangAlan / FSRFER
a TensorFlow implementation of the paper "Feature Super-Resolution Based Facial Expression Recognition for Multi-scale Low-Resolution Ima…
☆13 · Nov 30, 2021 · Updated 4 years ago
hyx1999 / SAM-Decoding
Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton
☆52 · May 12, 2026 · Updated 2 months ago
lfsszd / CS-Drafting
Cascade Speculative Drafting
☆33 · Apr 2, 2024 · Updated 2 years ago
D3Mlab / cr-lt-kgqa
CR-LT KGQA Dataset Repository
☆10 · Jun 1, 2025 · Updated last year
GATECH-EIC / Linearized-LLM
[ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
☆35 · Jun 12, 2024 · Updated 2 years ago
FasterDecoding / REST
REST: Retrieval-Based Speculative Decoding, NAACL 2024
☆220 · Mar 5, 2026 · Updated 4 months ago
murdockhou / MultiPoseNet-tensorflow
The TensorFlow implementation about Paper accepted on ECCV 2018
☆13 · Oct 29, 2018 · Updated 7 years ago
Egg-Hu / SMI
[ICML 2024] Sparse Model Inversion: Efficient Inversion of Vision Transformers with Less Hallucination
☆14 · Apr 29, 2025 · Updated last year
Luowaterbi / TokenRecycling
[ACL2025 Oral🔥] Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling
☆29 · Nov 11, 2025 · Updated 8 months ago
linfeng93 / BiTA
An innovative method expediting LLMs via streamlined semi-autoregressive generation and draft verification.
☆29 · Apr 15, 2025 · Updated last year
zsweet / zsw_AI_model
☆12 · Sep 25, 2018 · Updated 7 years ago
ACADLab / SA-DS
☆15 · Jul 25, 2024 · Updated 2 years ago
DensoITLab / bitprune
☆11 · Apr 5, 2023 · Updated 3 years ago
NieXC / pytorch-spm
Pytorch implementation of Single-Stage Multi-Person Pose Machines (ICCV'19)
☆15 · Jan 15, 2020 · Updated 6 years ago
zhengzangw / Sequence-Scheduling
PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline".
☆93 · May 23, 2023 · Updated 3 years ago
GATECH-EIC / ShiftAddViT
[NeurIPS 2023] ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer
☆30 · Dec 6, 2023 · Updated 2 years ago
h-jia / TTE
☆13 · Jul 14, 2025 · Updated last year
Adaxry / Unified_Layer_Skipping
☆15 · Apr 11, 2024 · Updated 2 years ago
VITA-Group / Q-Hitter
☆15 · Jun 4, 2024 · Updated 2 years ago
HanGuo97 / lq-lora
☆129 · Jan 22, 2024 · Updated 2 years ago
zsh2000 / MuvieNeRF
[ICCV 2023] Code for "Multi-task View Synthesis with Neural Radiance Fields"
☆12 · Oct 2, 2023 · Updated 2 years ago
hemingkx / SpecDec
Codes for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)
☆47 · Dec 9, 2023 · Updated 2 years ago
Cheliosoops / BitQ
☆10 · Apr 24, 2024 · Updated 2 years ago
zhxchd / SDAR_SplitNN
Code for NDSS '25 paper "Passive Inference Attacks on Split Learning via Adversarial Regularization"
☆13 · Sep 16, 2024 · Updated last year
euiin / SMART
SMART introduces a novel test-time framework where Small Language Models (SLMs) reason step-by-step, and Large Language Models (LLMs) pro…
☆12 · Jul 9, 2025 · Updated last year