vllm-project / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
★59,817 · Updated this week
Alternatives and similar repositories for vllm
Users who are interested in vllm are comparing it to the libraries listed below.
- SGLang is a fast serving framework for large language models and vision language models. ★18,662 · Updated this week
- Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM. ★46,777 · Updated this week
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★19,781 · Updated this week
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs. ★7,146 · Updated this week
- Large Language Model Text Generation Inference ★10,550 · Updated 3 weeks ago
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ★39,141 · Updated 4 months ago
- TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizati… ★11,801 · Updated this week
- Fast and memory-efficient exact attention ★19,864 · Updated this week
- QLoRA: Efficient Finetuning of Quantized LLMs ★10,680 · Updated last year
- Universal LLM Deployment Engine with ML Compilation ★21,440 · Updated last week
- LlamaIndex is the leading framework for building LLM-powered agents over your data. ★44,665 · Updated this week
- Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… ★17,938 · Updated this week
- Inference code for Llama models ★58,807 · Updated 8 months ago
- LLM inference in C/C++ ★87,385 · Updated this week
- Accessible large language models via k-bit quantization for PyTorch. ★7,647 · Updated last week
- Train transformer language models with reinforcement learning. ★15,818 · Updated this week
- Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sag… ★29,802 · Updated this week
- Python bindings for llama.cpp ★9,647 · Updated last month
- Open-source search and retrieval database for AI applications. ★23,781 · Updated this week
- Retrieval and Retrieval-augmented LLMs ★10,655 · Updated this week
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ★23,699 · Updated last year
- RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)… ★14,009 · Updated this week
- Tensor library for machine learning ★13,261 · Updated this week
- Ongoing research training transformer models at scale ★13,755 · Updated last week
- Go ahead and axolotl questions ★10,592 · Updated this week
- The definitive Web UI for local AI, with powerful features and easy setup. ★45,135 · Updated this week
- A framework for few-shot evaluation of language models. ★10,303 · Updated this week
- Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models. ★153,970 · Updated this week
- Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! ★40,169 · Updated this week
- 🦜🔗 Build context-aware reasoning applications ★116,801 · Updated this week