wangshuai09 / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

☆39

Alternatives and similar repositories for vllm

Users who are interested in vllm are comparing it to the libraries listed below.

cordercorder / knn-models
A retrieval augmented sequence modeling toolkit implemented based on Fairseq
☆29 · Mar 3, 2023 · Updated 2 years ago
DeepLink-org / dlinfer
☆74 · Updated this week
BBuf / flash-rwkv
☆32 · May 26, 2024 · Updated last year
deepspeedai / DeepSpeed-Kernels
☆71 · Mar 26, 2025 · Updated 10 months ago
vllm-project / vllm-ascend
Community maintained hardware plugin for vLLM on Ascend
☆1,651 · Feb 7, 2026 · Updated last week
Ascend / AscendSpeed
☆79 · Dec 15, 2023 · Updated 2 years ago
Atqarana / AI-Voicebot-for-Kids
An interactive companion toy that engages kids with storytelling, singing, and encouragement for physical activities using advanced AI t…
☆10 · Oct 15, 2024 · Updated last year
zishen-ucap / LTX-Video-xDiT
This project is based on the [LTX-Video](https://github.com/Lightricks/LTX-Video) algorithm of the diffusers and optimized and accelerate…
☆11 · Dec 31, 2024 · Updated last year
yanyachen / rFTRLProximal
FTRL-Proximal Online Learning Algorithm
☆15 · May 22, 2017 · Updated 8 years ago
nvidia-riva / common
Protocol buffers and other common resources.
☆13 · Jan 20, 2026 · Updated 3 weeks ago
Vprashant / s2-chunking-lib
A library for structural-semantic chunking of documents.
☆12 · Oct 8, 2025 · Updated 4 months ago
gogongxt / nano-sglang
☆120 · Updated this week
miradel51 / Self_Supervised_CWS
This project has included related source codes and datasets of our EMNLP2021 paper
☆10 · May 28, 2022 · Updated 3 years ago
JuliaSIMD / TriangularSolve.jl
rdiv!(::AbstractMatrix, ::UpperTriangular) and ldiv!(::LowerTriangular, ::AbstractMatrix)
☆12 · Nov 18, 2024 · Updated last year
xiaobing2007 / LBOcrTest
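OCR text recognition from photos, including image cropping; recognizes Chinese and English, with the best recognition rate among the resources currently available online
☆13 · Sep 20, 2016 · Updated 9 years ago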
ymoslem / MT-Tools
Collection of Common Machine Translation Tools
☆11 · Jul 26, 2022 · Updated 3 years ago
BinWang28 / FacEval
EMNLP 2022: Analyzing and Evaluating Faithfulness in Dialogue Summarization
☆13 · Mar 20, 2025 · Updated 10 months ago
helmholtz-analytics / mpi4torch
An MPI wrapper for the pytorch tensor library that is automatically differentiable
☆10 · Mar 27, 2023 · Updated 2 years ago
luciusssss / answer-sentence-selection
Character Embedding + ESIM + Focal Loss for Chinese Answer Sentence Selection
☆10 · Jan 4, 2020 · Updated 6 years ago
vbdi / epdserve
[ICML 2025] Efficiently Serving Large Multimodal Models Using EPD Disaggregation
☆22 · May 29, 2025 · Updated 8 months ago
PASSIONLab / distributed_sddmm
Distributed SDDMM Kernel
☆12 · Jul 8, 2022 · Updated 3 years ago
botbahlul / android-autosrt-v2
ANDROID APP to AUTO GENERATE SUBTITLE FILE and TRANSLATED SUBTITLE FILE (using unofficial online Google Translate API) for any audio/vide…
☆19 · May 5, 2024 · Updated last year
Muhtasham / llm-inference-simulator
🚀 LLM inference optimization simulator, modeling compute-bound prefill and memory-bound decode phases.
☆13 · Jul 12, 2025 · Updated 7 months ago
dfraze / binja_winmd
win32json Parser for TypeLibrary creation
☆12 · Feb 10, 2022 · Updated 4 years ago
passlab / Examples
LaTeX Examples Document Source
☆11 · Apr 9, 2024 · Updated last year
mllite / sklearn_explain
Model explanation provides the ability to interpret the effect of the predictors on the composition of an individual score.
☆13 · Jan 21, 2021 · Updated 5 years ago
bnosac / dlib
allowing R users to work with dlib through Rcpp
☆13 · Apr 11, 2018 · Updated 7 years ago
fengbinzhu / Doc2SoarGraph
The repo of the Doc2SoarGraph framework
☆10 · Sep 17, 2024 · Updated last year
MANGA-UOFA / PTfer
☆11 · Nov 13, 2024 · Updated last year
jmduarte / capstone-particle-physics-domain
Website for Particle Physics Domain (UCSD Capstone)
☆12 · Oct 23, 2021 · Updated 4 years ago
cat538 / MxMoE
[ICML 2025] MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design
☆22 · Jul 4, 2025 · Updated 7 months ago
ml-distribution / phrase-finding
A distributed machine learning algorithm for new word discovery.
☆15 · Jul 21, 2014 · Updated 11 years ago
ACA-Lab-SJTU / token-ring
☆13 · Jan 7, 2025 · Updated last year
BrightXiaoHan / optimum-ascend
Optimized inference with Ascend and Hugging Face
☆12 · Apr 23, 2024 · Updated last year
SpRegTiling / sparse-register-tiling
☆10 · Mar 2, 2024 · Updated last year
zhang677 / PCL-lite
Code for "Adaptive Self-improvement LLM Agentic System for ML Library Development" (ICML 2025)
☆15 · Jan 6, 2026 · Updated last month
ChrisTimperley / RepairChain
AIxCC: automated vulnerability repair via LLMs, search, and static analysis
☆11 · Jul 16, 2024 · Updated last year
ssbuild / aigc_evals
aigc evals
☆10 · Dec 2, 2023 · Updated 2 years ago
VD44 / Rouge-L-Tensorflow
ROUGE L metric implementation using tensorflow ops
☆12 · Sep 17, 2018 · Updated 7 years ago