ovshake/nano-vllm

a fun and educational take on vLLM

☆211

Alternatives and similar repositories for nano-vllm

Users who are interested in nano-vllm are comparing it to the repositories listed below.

ChinmayK0607 / heiretsu
Educational WIP
☆73 · Feb 16, 2026 · Updated 5 months ago
ysy-phoenix / evalhub
All-in-one benchmarking platform for evaluating LLM.
☆15 · Nov 12, 2025 · Updated 8 months ago
Wenyueh / MinivLLM
Based on Nano-vLLM, a simple replication of vLLM with self-contained paged attention and flash attention implementation
☆942 · Jul 22, 2026 · Updated last week
ananyahjha93 / libself
PyTorch Lightning based framework to run experiments for self-supervised learning tasks.
☆10 · Feb 14, 2020 · Updated 6 years ago
microsoft / post-training-toolkit
☆25 · Jan 28, 2026 · Updated 6 months ago
kvignesh1420 / cot-icl-lab
A framework to meta-train transformers for causal ICL
☆11 · Jul 15, 2026 · Updated 2 weeks ago
aastroza / structured-generation-benchmark
Structured Generation Evals
☆14 · Sep 25, 2024 · Updated last year
sgl-project / mini-sglang
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
☆4,643 · May 17, 2026 · Updated 2 months ago
apple / ml-scalefit
☆18 · Mar 3, 2026 · Updated 4 months ago
GeeeekExplorer / nano-vllm
Nano vLLM
☆14,679 · Apr 26, 2026 · Updated 3 months ago
Better-Call-Paul / blackwell_gemm
☆19 · Apr 26, 2026 · Updated 3 months ago
neelsomani / kv-marketplace
Cross-GPU KV Cache Marketplace
☆26 · Nov 12, 2025 · Updated 8 months ago
yuzhaouoe / pretraining-data-packing
[ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training
☆24 · Aug 18, 2024 · Updated last year
zhehangdu / Newton-Muon
The Newton-Muon optimizer
☆30 · Jun 5, 2026 · Updated last month
backprop-ai / vllm-benchmark
Benchmarking the serving capabilities of vLLM
☆59 · Aug 20, 2024 · Updated last year
aerlabsAI / nano-vllm
☆15 · Mar 11, 2026 · Updated 4 months ago
ambisinister / mla-experiments
Experiments on Multi-Head Latent Attention
☆101 · Aug 19, 2024 · Updated last year
IST-DASLab / llmq
Quantized LLM training in pure CUDA/C++.
☆251 · Updated this week
HamzaElshafie / gpt-oss-20B
A PyTorch implementation of the GPT-OSS-20B architecture. All components are coded from scratch: RoPE with YaRN, RMSNorm, SwiGLU with cla…
☆238 · Dec 2, 2025 · Updated 7 months ago
steamship-core / ai-adventure-agent
☆17 · Feb 14, 2024 · Updated 2 years ago
TroyDoesAI / AI_Research
My Gen AI research
☆11 · Jun 3, 2024 · Updated 2 years ago
wafer-ai / gpu-perf-engineering-resources
A curriculum for learning about gpu performance engineering, from scratch to what the frontier AI labs do
☆1,273 · Apr 27, 2026 · Updated 3 months ago
RiddleHe / nanochat
The best ChatGPT that $100 can buy.
☆56 · Updated this week
Elma-dev / TODa
TODa: Tamazight Open Dataset
☆19 · Jan 13, 2025 · Updated last year
caiovicentino / apple-silicon-internals
Reverse engineering toolkit for Apple Silicon private APIs. 55+ Apple Intelligence models mapped, Metal 4 ML pipeline, 1009 IOReport chan…
☆15 · Mar 26, 2026 · Updated 4 months ago
Maharshi-Pandya / gpu-stuff
Repository for GPU related kernels for learning/testing purposes
☆19 · May 27, 2026 · Updated 2 months ago
junuxyz / tiny-speculators
☆22 · Jul 20, 2026 · Updated last week
max-muoto / monty-dspy-rlm
Example for a Monty-enabled RLM in DSPy
☆20 · Feb 16, 2026 · Updated 5 months ago
MDK8888 / vllmini
A minimal implementation of vllm.
☆74 · Jul 27, 2024 · Updated 2 years ago
sanyalsunny111 / Looped-GPT
Minimal and highly hackable implementation of Looped Transformers with GPT
☆25 · Mar 8, 2026 · Updated 4 months ago
omkaark / spotty
Simple orchestration for EC2 spot containers
☆19 · Sep 27, 2024 · Updated last year
ovshake / cobra
Code for COBRA: Contrastive Bi-Modal Representation Algorithm (https://arxiv.org/abs/2005.03687)
☆15 · Jul 6, 2023 · Updated 3 years ago
muellerzr / smol-moe
☆25 · Oct 10, 2025 · Updated 9 months ago
suryatejreddy / Memeify
Code and Dataset for Memeify: A Large-scale Meme Generation System
☆25 · May 21, 2020 · Updated 6 years ago
rkinas / triton-resources
A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.
☆496 · Mar 10, 2025 · Updated last year
IzumiSatoshi / Tune-A-Video
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
☆12 · Feb 23, 2023 · Updated 3 years ago
huggingface / picotron
Minimalistic 4D-parallelism distributed training framework for education purpose
☆2,260 · Aug 26, 2025 · Updated 11 months ago
ustc-ai-sgy / ustc-ai-sgy.github.io
Homepage of the USTC-SGY general education course on artificial intelligence
☆22 · Jun 19, 2025 · Updated last year
allenai / drug-combo-extraction
☆22 · Oct 20, 2022 · Updated 3 years ago