backprop-ai / vllm-benchmark

Benchmarking the serving capabilities of vLLM

☆48

Alternatives and similar repositories for vllm-benchmark

Users who are interested in vllm-benchmark are comparing it to the libraries listed below.

h2oai / enterprise-h2ogpte
Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform
☆87 Updated last month
substratusai / vllm-docker
☆63 Updated 4 months ago
etalab-ia / albert-models
Deploy a light and full OpenAI API for production with vLLM to support /v1/embeddings with all embedding models.
☆42 Updated last year
premAI-io / benchmarks
🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.
☆137 Updated last year
asprenger / ray_vllm_inference
A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.
☆69 Updated last year
promptslab / LLMtuner
Fine-tune LLMs in a few lines of code (Text2Text, Text2Speech, Speech2Text)
☆240 Updated last year
nyunAI / PruneGPT
☆51 Updated last year
bentoml / BentoVLLM
Self-host LLMs with vLLM and BentoML
☆139 Updated last week
QuixiAI / spectrum
☆129 Updated 3 months ago
LLM360 / amber-data-prep
Data preparation code for Amber 7B LLM
☆91 Updated last year
arcee-ai / EvolKit
EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M…
☆230 Updated 9 months ago
fw-ai / benchmark
Benchmark suite for LLMs from Fireworks.ai
☆76 Updated this week
daniel-furman / sft-demos
Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.
☆77 Updated 9 months ago
SalesforceAIResearch / SFR-RAG
☆77 Updated 6 months ago
argilla-io / argilla-cookbook
Simple examples using Argilla tools to build AI
☆53 Updated 8 months ago
Cerebras / DocChat
GPT-4 Level Conversational QA Trained In a Few Hours
☆63 Updated 11 months ago
mani-kantap / llm-inference-solutions
A collection of all available inference solutions for the LLMs
☆91 Updated 5 months ago
HITsz-TMG / KaLM-Embedding
Code for KaLM-Embedding models
☆86 Updated last month
AlexBodner / How_Much_VRAM
☆102 Updated 11 months ago
michaelfeil / embed
A stable, fast and easy-to-use inference library with a focus on a sync-to-async API
☆45 Updated 10 months ago
lapp0 / lm-inference-engines
Comparison of Language Model Inference Engines
☆222 Updated 7 months ago
weaviate / structured-rag
Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models
☆111 Updated 3 months ago
flowaicom / flow-judge
Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte…
☆76 Updated 9 months ago
IlyasMoutawwakil / py-txi
A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.
☆33 Updated 2 months ago
agokrani / distillKitPlus
Easy-to-use, high-performance knowledge distillation for LLMs
☆88 Updated 3 months ago
intel / neural-speed
An innovative library for efficient LLM inference via low-bit quantization
☆349 Updated 11 months ago
neuralmagic / nm-vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆266 Updated 9 months ago
EmbeddedLLM / vllm
vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs
☆87 Updated this week
aniketmaurya / fastserve-ai
Machine Learning Serving focused on GenAI with simplicity as the top priority.
☆59 Updated 3 weeks ago
ServiceNow / Fast-LLM
Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research
☆218 Updated this week