qiulang/vllm-sglang-perf

Evaluate how vLLM and SGLang perform when running a small LLM model on a mid-range NVIDIA GPU

☆20

Alternatives and similar repositories for vllm-sglang-perf

Users who are interested in vllm-sglang-perf are comparing it to the libraries listed below.

falcondai / pyrouge
A Python wrapper for the ROUGE summarization evaluation package
☆14 · Aug 9, 2017 · Updated 8 years ago
Information-Fusion-Lab-Umass / ClinicalNotes_TimeSeries
The repository for the paper "Predicting in-hospital mortality by combining clinical notes with time-series data"
☆12 · May 23, 2021 · Updated 5 years ago
neuralmagic / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆17 · Updated this week
yonseicasl / Nebula
Nebula: Deep Neural Network Benchmarks in C++
☆13 · Jan 2, 2025 · Updated last year
lucy3 / whos_filtered
☆15 · Oct 4, 2024 · Updated last year
mabusaa / argocd-course-app-of-apps
Examples of App of Apps Pattern
☆10 · Jan 17, 2023 · Updated 3 years ago
dstibrany / LockManager
LockManager with deadlock detection for implementing 2PL
☆13 · Mar 13, 2019 · Updated 7 years ago
gagan0123 / fix-image-rotation
Fixes the rotation of the images based on EXIF data
☆15 · Apr 6, 2026 · Updated 3 months ago
ChicagoHAI / decsum
Implementation for Decision-focused Summarization (EMNLP2021)
☆12 · Mar 14, 2022 · Updated 4 years ago
sciknoworg / deep-research
AI-based deep research system
☆17 · Updated this week
vectornguyen76 / QN-Hackathon-CTA-Matrix
Quy Nhon AI Hackathon 2022 - Challenge 2: Review Analytics - Top 1 Solution
☆11 · Sep 21, 2022 · Updated 3 years ago
invergent-ai / surogate-studio
Enterprise-grade LLMOps platform to accelerate the development and deployment of generative AI applications.
☆16 · Feb 25, 2026 · Updated 4 months ago
llm-d-incubation / llm-d-infra
llm-d helm charts and deployment examples
☆59 · May 1, 2026 · Updated 2 months ago
NAVER-Cloud-HyperCLOVA-X / OmniServe
☆69 · Dec 29, 2025 · Updated 6 months ago
Tiiiger / templm
Code release for "TempLM: Distilling Language Models into Template-Based Generators"
☆14 · Jul 21, 2022 · Updated 4 years ago
yonseicasl / Kite
Kite: Architecture Simulator for RISC-V Instruction Set
☆21 · Mar 22, 2026 · Updated 3 months ago
SvTPM-impl / SvTPM
vTPM with SGX protection
☆12 · May 30, 2019 · Updated 7 years ago
aws-neuron / deep-learning-containers
AWS Neuron Deep Learning Containers (DLCs) are a set of Docker images for training and serving models on AWS Trainium and Inferentia inst…
☆23 · Updated this week
rayures / vTPM
libtpms / swtpm software emulation of a Trusted Platform Module (TPM 1.2 and TPM 2.0) compile script
☆13 · Sep 16, 2020 · Updated 5 years ago
VumBleBot / Group-Activity
Repository for storing issues and information for the ODQA Baseline team project.
☆12 · May 22, 2021 · Updated 5 years ago
kenlimmj / fightin-words
A scikit-learn compliant implementation of Monroe et al.'s Fightin' Words analysis method.
☆11 · May 26, 2026 · Updated last month
CogComp / MultiOpEd
MULTIOPED: A Corpus of Multi-Perspective News Editorials.
☆12 · Aug 25, 2021 · Updated 4 years ago
multi-swe-bench / MagentLess
☆13 · Jul 31, 2025 · Updated 11 months ago
phineas-pta / speech-synthesis-ngngngan
Python script to download & process data to train a speech-synthesis model of Vietnamese M.C. Nguyễn Ngọc Ngạn
☆15 · Aug 13, 2024 · Updated last year
embeddings-benchmark / arena
Code for the MTEB Arena
☆25 · Jul 2, 2025 · Updated last year
allenai / chime
Repository containing dataset, models and code associated with the CHIME project
☆18 · Aug 22, 2024 · Updated last year
NAVER-Cloud-HyperCLOVA-X / hcx-vllm-plugin
vLLM plugin for HyperCLOVAX
☆15 · Jan 27, 2026 · Updated 5 months ago
happyfish100 / fastcfs-csi
k8s CSI driver for FastCFS
☆13 · Mar 17, 2024 · Updated 2 years ago
dallascard / DWAC
Deep Weighted Averaging Classifiers
☆22 · Feb 4, 2019 · Updated 7 years ago
lucapinello / bsub_jupyter
Connect to an LSF main node directly or through an SSH jump node, launch a Jupyter notebook via bsub and automatically open a tunnel. The n…
☆20 · Oct 27, 2021 · Updated 4 years ago
keesun / demo-boot-web
Code for Part 2 of the Inflearn Spring Web MVC course
☆12 · Jan 26, 2019 · Updated 7 years ago
Zee05 / JSE-Stock-Market-Returns-Prediction-Using-Multivariate-Time-Series-Data
An application of Multilayer Perceptron, Random Forest Regression and Recurrent Neural Networks (LSTM)
☆14 · Oct 3, 2021 · Updated 4 years ago
hpcaitech / ColossalAI-Pytorch-lightning
☆24 · Nov 22, 2022 · Updated 3 years ago
ProsusAI / stack-eval
Official implementation for the paper "StackEval: Benchmarking LLMs in Coding Assistance", https://arxiv.org/abs/2412.05288
☆20 · Oct 30, 2024 · Updated last year
huggingface / tgi-gaudi
Large Language Model Text Generation Inference on Habana Gaudi
☆34 · Mar 20, 2025 · Updated last year
antsanchez / prompto
Interact with various LLMs in your browser (LangChain.js, Angular)
☆17 · May 7, 2026 · Updated 2 months ago
iPieter / llmq
A Scheduler for Batched LLM Inference
☆19 · Oct 5, 2025 · Updated 9 months ago
mixedbread-ai / maxsim-cpu
☆57 · Jul 10, 2025 · Updated last year
iesl / CSFCube
A Test Collection of Computer Science Papers for Faceted Query by Example
☆23 · Nov 28, 2021 · Updated 4 years ago