slwang-ustc/nano-vllm-v1

slwang-ustc / nano-vllm-v1

Nano vLLM with vLLM v1's request scheduling strategy and chunked prefill

☆91

Alternatives and similar repositories for nano-vllm-v1

Users who are interested in nano-vllm-v1 are comparing it to the repositories listed below.

HarukiYqM / All-In-One-Neural-Composition
PyTorch code for our paper "Resource-Adaptive Federated Learning with All-In-One Neural Composition" (NeurIPS2022)
☆19 · Dec 4, 2022 · Updated 3 years ago
Clark5 / Poseidon
An NS-3 implementation of the Poseidon congestion control algorithm (NSDI 2023).
☆34 · Jan 28, 2024 · Updated 2 years ago
LMCache / demo
☆30 · Apr 17, 2025 · Updated last year
Workday / cpc
☆24 · Jan 16, 2025 · Updated last year
hanchenye / polyaie
An MLIR-based compiler from C/C++ to AMD-Xilinx Versal AIE
☆17 · Aug 5, 2022 · Updated 3 years ago
jcf94 / RDMA-wheel
A Simple RDMA Wheel
☆22 · Mar 31, 2019 · Updated 7 years ago
oceanbase / kernel-advanced
☆14 · Aug 9, 2023 · Updated 2 years ago
Cherries-Man / NS3-MP-RDMA
Simulation of the Multi-Path-RDMA algorithm based on ns-3
☆23 · May 12, 2024 · Updated 2 years ago
lumina-test / lumina
Lumina is a user-friendly tool to test the correctness and performance of hardware network stacks.
☆29 · Jan 8, 2024 · Updated 2 years ago
byrzhm / hadoop-docker-cluster
Docker cluster configuration for Hadoop
☆10 · Jun 8, 2024 · Updated 2 years ago
cuhk-mass / SEPH
☆19 · May 26, 2023 · Updated 3 years ago
HLRJ / Cpu0_For_LLVM17
Adds a new Cpu0 backend to LLVM 17.0.6
☆12 · Apr 22, 2024 · Updated 2 years ago
stephenneuendorffer / vyasa
Xilinx Modifications to Halide
☆13 · May 3, 2021 · Updated 5 years ago
CRobeck / instrument-amdgpu-kernels
LLVM/MLIR based compiler instrumentation of AMD GPU kernels
☆21 · Jul 13, 2025 · Updated 11 months ago
a993096281 / YCSB-HWDB
YCSB-C for HWDB!
☆18 · May 30, 2020 · Updated 6 years ago
HNUSystemsLab / HashEvaluation
☆17 · Sep 26, 2022 · Updated 3 years ago
rehohoho / onnx2versal
Generate Versal system design from ONNX model. AI engine kernels. Sub-microsecond speeds for autoencoders.
☆19 · Dec 29, 2024 · Updated last year
nqdtan / vck5000_vivado_ulp
An alternative Vivado custom design example (to fully Vitis) for the User Logic Partition targeting VCK5000
☆14 · Jul 16, 2024 · Updated last year
iml130 / iree-template-cpp
IREE C++ Template
☆17 · Jul 30, 2024 · Updated last year
ucbrise / cs294-ai-sys-sp22
CS294 AI Systems Class Website
☆18 · Apr 25, 2022 · Updated 4 years ago
alexarmbr / matmul-playground
☆29 · Apr 7, 2025 · Updated last year
SMILELab-FL / FedPETuning
☆69 · Jun 2, 2023 · Updated 3 years ago
matrix97317 / OneNeuralNetwork
This is a cross-chip platform collection of operators and a unified neural network library.
☆17 · Nov 3, 2023 · Updated 2 years ago
Xilinx / aie-rt
☆25 · Jun 14, 2026 · Updated 2 weeks ago
CalvinXKY / InfraTech
Sharing AI infra knowledge and coding exercises: introductions to the PyTorch/vLLM/SGLang frameworks ⚡️, performance acceleration 🚀, large-model fundamentals 🧠, AI software and hardware 🔧, and more
☆2,711 · Jun 22, 2026 · Updated last week
karnawhat / 6.172
MIT 6.172 Performance Engineering of Software Systems
☆16 · Dec 30, 2021 · Updated 4 years ago
XiaoSongXS / dgemm-knl
DGEMM on KNL, achieve 75% MKL
☆19 · May 19, 2022 · Updated 4 years ago
luotianyou349 / PnPDA
This is the official implementation of ECCV2024 paper "Plug and Play: A Representation Enhanced Domain Adapter for Collaborative Percepti…
☆19 · Aug 13, 2024 · Updated last year
ikuokuo / start-ai-compiler
Start AI Compiler
☆50 · Feb 26, 2026 · Updated 4 months ago
NetX-lab / Ayo
[ASPLOS'25] Towards End-to-End Optimization of LLM-based Applications with Ayo
☆74 · Mar 11, 2026 · Updated 3 months ago
CMU-SAFARI / SPARTA
A novel spatial accelerator for horizontal diffusion weather stencil computation, as described in ICS 2023 paper by Singh et al. (https:/…
☆22 · Jul 27, 2023 · Updated 2 years ago
MrShawCode / riscv-pke
RISC-V Proxy Kernel for Education
☆29 · Dec 5, 2023 · Updated 2 years ago
xxyux / SpInfer
SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs
☆68 · Mar 25, 2025 · Updated last year
Trinkle23897 / decaf-complier
Compiler Principles, Fall 2018: 6 programming assignments (PAs)
☆30 · Jan 9, 2019 · Updated 7 years ago
llmsystem / llmsys_code_examples
☆34 · Mar 31, 2026 · Updated 3 months ago
uwsampl / sparsetir-artifact
Repository for artifact evaluation of ASPLOS 2023 paper "SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning"
☆25 · Feb 24, 2023 · Updated 3 years ago
DistSysCorp / infra-interview
Interview data structures and algorithms
☆46 · Apr 22, 2024 · Updated 2 years ago
HarliWu / FedBiOT
An official implementation of "FedBiOT: LLM Local Fine-tuning in Federated Learning without Full Model", which has been accepted by KDD'2…
☆62 · Mar 3, 2025 · Updated last year
xiaoyu1998 / llvm-cpu0
LLVM Backend tutorial Cpu0
☆25 · Nov 5, 2023 · Updated 2 years ago