A from-scratch Prefill/Decode disaggregation inference engine for LLMs
☆164 · May 10, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for nanoPD
Users that are interested in nanoPD are comparing it to the libraries listed below.
- Lock-free elimination back-off stack ☆13 · Jan 6, 2022 · Updated 4 years ago
- Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton ☆50 · May 12, 2026 · Updated last week
- Data Plane Development Kit ☆13 · Nov 10, 2025 · Updated 6 months ago
- Official implementation for the paper Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapp… ☆14 · Updated this week
- SocksDirect code repository ☆20 · May 6, 2026 · Updated 2 weeks ago
- HeteroRefactor: Refactoring for Heterogeneous Computing with FPGA ☆11 · Mar 13, 2026 · Updated 2 months ago
- High Performance KV Cache Store for LLM ☆54 · Updated this week
- F# LL(k) Parser generator. ☆12 · Oct 26, 2022 · Updated 3 years ago
- WaveNet Vocoder Samples ☆23 · Aug 23, 2019 · Updated 6 years ago
- Persistent Kernel + JIT-Injected Operators (CUDA) ☆47 · Jan 27, 2026 · Updated 3 months ago
- ☆19 · Jan 8, 2026 · Updated 4 months ago
- [NSDI25] AutoCCL: Automated Collective Communication Tuning for Accelerating Distributed and Parallel DNN Training ☆31 · May 2, 2025 · Updated last year
- Hi-Speed DNN Training with Espresso: Unleashing the Full Potential of Gradient Compression with Near-Optimal Usage Strategies (EuroSys '2… ☆15 · Sep 21, 2023 · Updated 2 years ago
- A naive interpreter for IR of NJU compiler principle lab3, to accelerate interpretation, the ir will be compiled to machine-friendly bina… ☆16 · Jun 17, 2020 · Updated 5 years ago
- The official implementation of Cross-Task Experience Sharing (COPS) ☆29 · Oct 23, 2024 · Updated last year
- SGLang is a fast serving framework for large language models and vision language models. ☆32 · Updated this week
- Python Tutor Package ☆18 · Sep 6, 2023 · Updated 2 years ago
- ☆16 · Jan 9, 2017 · Updated 9 years ago
- Website system for the NJU-IT侠 club, including appointment booking, an admin backend, and more... ☆16 · May 11, 2022 · Updated 4 years ago
- Official Implementation of "Learning Harmonized Representations for Speculative Sampling" (HASS) ☆57 · Mar 14, 2025 · Updated last year
- Implementation of Multiobjective Tree-structured Parzen Estimator ☆15 · Mar 19, 2024 · Updated 2 years ago
- Demo of Role-Based Access Control in LLM Vector Databases ☆18 · Nov 27, 2023 · Updated 2 years ago
- Some study notes on the official LLaVA code ☆29 · Oct 11, 2024 · Updated last year
- ☆18 · Jul 11, 2021 · Updated 4 years ago
- eTran: Extensible Kernel Transport with eBPF ☆48 · Apr 28, 2025 · Updated last year
- Public Network Latency Datasets gathered from two platforms: PlanetLab and Seattle ☆18 · Jan 5, 2018 · Updated 8 years ago
- ☆13 · Mar 6, 2023 · Updated 3 years ago
- Managed collective communication service ☆24 · Sep 2, 2024 · Updated last year
- ☆81 · Sep 15, 2025 · Updated 8 months ago
- ☆21 · Jun 9, 2025 · Updated 11 months ago
- EMPHASIS: An Emotional Phoneme-based Acoustic Model for Speech Synthesis System ☆15 · Mar 31, 2019 · Updated 7 years ago
- A Simple CPP Static Analysis Framework ☆21 · Jun 16, 2023 · Updated 2 years ago
- an implementation of parallel skills like amp, ddp, pp, tp for learning purposes ☆14 · Nov 18, 2023 · Updated 2 years ago
- ☆16 · Mar 19, 2025 · Updated last year
- A machine model for line-rate programmable switches ☆27 · Oct 8, 2016 · Updated 9 years ago
- [EuroSys'25] Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization ☆23 · Apr 13, 2026 · Updated last month
- ☆122 · Apr 10, 2026 · Updated last month
- ☆24 · Jul 7, 2024 · Updated last year
- Python Model-View-Controller application generator for automating creation of PyQt and PySide applications. ☆13 · Apr 16, 2015 · Updated 11 years ago