notaDestroyer/vllm-benchmark-suite

Benchmarking tool for vLLM inference performance with GPU monitoring

☆52

Alternatives and similar repositories for vllm-benchmark-suite

Users that are interested in vllm-benchmark-suite are comparing it to the libraries listed below.

keeeeenw / TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
☆14 · Mar 30, 2024 · Updated 2 years ago
wyf111 / opencode-sop-engine
Production-grade Skill orchestration, SOP enforcement, and long-context runtime control for OpenCode
☆18 · Mar 29, 2026 · Updated 3 months ago
D3voz / audiobook-maker-pro
Fast, self-contained desktop audiobook creator with optimized local Chatterbox TTS, voice cloning, EPUB/PDF/TXT support, and optional TTS…
☆32 · Updated this week
autollama / autollama
Anthropic's Contextual Retrieval implementation with visual chunk comparison. Preview context enrichment before/after embedding.
☆30 · Sep 25, 2025 · Updated 9 months ago
artoo-corporation / D2-Python
Detect and Deny - Deterministic Function-Level Guardrails for AI Agents
☆18 · Jan 16, 2026 · Updated 6 months ago
Recklesz / FileAggregator-for-LLMs
🔍 Dead-simple local file selector that preps your docs for LLM prompts, no cloud needed. Drop in your files, get perfectly formatted con…
☆11 · Jan 11, 2025 · Updated last year
OrbFrontend / Orb
Agentic LLM RP Frontend
☆22 · Updated this week
jgreathouse9 / FDIDTutorial
This repository contains the Python code to estimate the Forward and Augmented DID estimators.
☆12 · Dec 4, 2024 · Updated last year
esullivan-nvidia / fex_autoinstall
Automated install script for Steam & FEX on Ubuntu 24.04
☆18 · Nov 10, 2025 · Updated 8 months ago
shubhomoydas / pyaad
Implementation of Active Anomaly Discovery (AAD) in Python
☆15 · Dec 19, 2017 · Updated 8 years ago
codeforamerica / designforamerica
An online application for designers to give back to local governments and simultaneously build their portfolio.
☆10 · Jun 20, 2011 · Updated 15 years ago
romiluz13 / pi-agent-skills
☆19 · Apr 13, 2026 · Updated 3 months ago
purijatin / Distributed-Key-Value-DB
Asynchronous Distributed Key Value Pair database
☆15 · Jan 17, 2014 · Updated 12 years ago
coder543 / llm-speed-benchmark
A tool that can be used to measure the sequential performance of any OpenAI-compatible LLM API
☆25 · Aug 1, 2024 · Updated last year
xTimeCrystal / MiniModel
☆41 · Feb 25, 2026 · Updated 4 months ago
neur0map / polymaster
Monitor large transactions on Polymarket and Kalshi prediction markets with anomaly detection
☆25 · Feb 26, 2026 · Updated 4 months ago
davedgd / python-bootcamp
Python Bootcamp Materials
☆11 · Dec 1, 2021 · Updated 4 years ago
RAZZULLIX / fast_topk_batched
High-performance batched Top-K selection for CPU inference. Up to 80x faster than PyTorch, optimized for LLM sampling with AVX2 SIMD.
☆17 · Mar 20, 2026 · Updated 4 months ago
subingangadharan / cmu15418
My solution code to parallel architecture and programming Spring 2016
☆12 · Aug 15, 2016 · Updated 9 years ago
Ashx098 / sft-play
☆51 · Oct 1, 2025 · Updated 9 months ago
gnodisnait / nball4tree
☆21 · Jul 15, 2024 · Updated 2 years ago
The-Responsible-AI-Initiative / LLM_Ethics_Benchmark
Moral Operational Reasoning Assessment for Language Systems
☆22 · Apr 8, 2026 · Updated 3 months ago
LLM360 / website
Website for LLM360
☆15 · Apr 27, 2026 · Updated 2 months ago
MirunaPislar / emoji2vec
Train emoji embeddings based on emoji descriptions.
☆18 · Mar 31, 2018 · Updated 8 years ago
pranavkumaarofficial / nlcli-wizard
Natural language control for Python CLI tools using locally-trained SLMs (CPU inference)
☆32 · Apr 10, 2026 · Updated 3 months ago
dalab / dissolve-struct
Distributed solver library for large-scale structured output prediction, based on Spark. Project website:
☆17 · Mar 3, 2016 · Updated 10 years ago
time-palominodb / PalominoClusterTool
A tool for creating large database clusters quickly.
☆21 · Dec 14, 2012 · Updated 13 years ago
alew3 / Turso_iOS
Sample project to build and run Turso's SQLite fork on iOS and use vector search functionality on device
☆15 · Jul 26, 2024 · Updated last year
local-inference-lab / sparkinfer
☆139 · Updated this week
nociza / cuti
Container and workplace isolation for Claude Code and OpenClaw. Safely skip all permissions for claude.
☆23 · Jul 4, 2026 · Updated 2 weeks ago
k-koehler / gguf-tensor-overrider
☆58 · Oct 10, 2025 · Updated 9 months ago
Ther-nullptr / circult-eda-mlsys-tinyml-arxiv-daily
🎓 Automatically update circult-eda-mlsys-tinyml papers daily using GitHub Actions (updates every 8 hours)
☆10 · Updated this week
Nero10578 / LLM-Inference-Benchmark
☆14 · Aug 25, 2024 · Updated last year
TengHu / AutoCoder
☆11 · Jan 28, 2024 · Updated 2 years ago
YATSEE-Labs / YATSEE
YATSEE - Yet Another Tool for Speech Extraction & Enrichment
☆31 · Jun 29, 2026 · Updated 3 weeks ago
rpryzant / deconfounded_lexicon_induction
☆21 · Nov 21, 2022 · Updated 3 years ago
matatonic / openedai-images
An OpenAI API compatible images server to generate or manipulate images.
☆18 · Feb 2, 2025 · Updated last year
volker48 / disco-whisper
Discord bot that will transcribe audio using OpenAI Whisper
☆12 · Jun 6, 2023 · Updated 3 years ago
mzbac / mlx_sharding
Distributed Inference for MLX LLMs
☆102 · Aug 1, 2024 · Updated last year