vllm-project / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆67,633 Updated last week
Alternatives and similar repositories for vllm
Users who are interested in vllm are comparing it to the libraries listed below.
- SGLang is a high-performance serving framework for large language models and multimodal models. ☆22,556 Updated this week
- Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM. ☆51,009 Updated this week
- Large Language Model Text Generation Inference ☆10,731 Updated 2 weeks ago
- Fast and memory-efficient exact attention ☆21,635 Updated last week
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ☆20,502 Updated this week
- LLM inference in C/C++ ☆93,398 Updated this week
- LlamaIndex is the leading framework for building LLM-powered agents over your data. ☆46,355 Updated last week
- Train transformer language models with reinforcement learning. ☆17,005 Updated last week
- 🦜🔗 The platform for reliable agents. ☆124,611 Updated this week
- Python bindings for llama.cpp ☆9,917 Updated 5 months ago
- Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… ☆18,148 Updated 2 months ago
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs. ☆7,544 Updated this week
- DSPy: The framework for programming—not prompting—language models ☆31,716 Updated this week
- Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 20+ clouds, o… ☆9,311 Updated this week
- QLoRA: Efficient Finetuning of Quantized LLMs ☆10,815 Updated last year
- Inference code for Llama models ☆59,075 Updated 11 months ago
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ☆33,981 Updated this week
- Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024) ☆65,942 Updated this week
- Awesome-LLM: a curated list of Large Language Model ☆26,035 Updated 5 months ago
- The official Meta Llama 3 GitHub site ☆29,185 Updated 11 months ago
- Tensor library for machine learning ☆13,840 Updated last week
- Accessible large language models via k-bit quantization for PyTorch. ☆7,896 Updated last week
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ☆39,375 Updated 7 months ago
- TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizat… ☆12,705 Updated this week
- Retrieval and Retrieval-augmented LLMs ☆11,187 Updated last month
- Open-source search and retrieval database for AI applications. ☆25,639 Updated this week
- Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud. ☆26,197 Updated 2 weeks ago
- Universal LLM Deployment Engine with ML Compilation ☆21,896 Updated 3 weeks ago
- Ongoing research training transformer models at scale ☆14,939 Updated this week
- Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models" ☆13,169 Updated last year