vllm-project / aibrix
Cost-efficient and pluggable Infrastructure components for GenAI inference
☆4,104 Updated this week
Alternatives and similar repositories for aibrix
Users who are interested in aibrix are comparing it to the libraries listed below.
- Supercharge Your LLM with the Fastest KV Cache Layer ☆5,030 Updated this week
- vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization ☆1,722 Updated this week
- A Datacenter Scale Distributed Inference Serving Framework ☆4,841 Updated this week
- llm-d is a Kubernetes-native high-performance distributed LLM inference framework ☆1,656 Updated this week
- Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM ☆1,861 Updated last week
- A lightweight data processing framework built on DuckDB and 3FS. ☆4,769 Updated 5 months ago
- Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI. ☆3,846 Updated this week
- The easiest way to deploy agents, MCP servers, models, RAG, pipelines and more. No MLOps. No YAML. ☆3,524 Updated last week
- Sky-T1: Train your own O1 preview model within $450 ☆3,324 Updated last month
- The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents. ☆1,284 Updated this week
- Nano vLLM ☆6,091 Updated 2 months ago
- Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs ☆3,397 Updated 3 months ago
- SGLang is a fast serving framework for large language models and vision language models. ☆17,401 Updated this week
- FlashInfer: Kernel Library for LLM Serving ☆3,650 Updated this week
- Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement… ☆6,332 Updated this week
- verl: Volcano Engine Reinforcement Learning for LLMs ☆12,809 Updated this week
- Democratizing Reinforcement Learning for LLMs ☆4,074 Updated last week
- NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extra… ☆2,733 Updated last week
- The python library for real-time communication ☆4,249 Updated last week
- LLMPerf is a library for validating and benchmarking LLMs ☆996 Updated 8 months ago
- Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation ☆7,904 Updated 3 months ago
- Flexible and powerful framework for managing multiple AI agents and handling complex conversations ☆6,422 Updated last week
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs ☆548 Updated this week
- Open Source Application for Advanced LLM + Diffusion Engineering: interact, train, fine-tune, and evaluate large language models on your … ☆4,164 Updated this week
- Expert Parallelism Load Balancer ☆1,255 Updated 5 months ago
- ☆3,525 Updated 4 months ago
- Run PyTorch LLMs locally on servers, desktop and mobile ☆3,607 Updated 3 weeks ago
- ☆3,464 Updated 5 months ago
- AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-te… ☆1,048 Updated this week
- A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training. ☆2,856 Updated 5 months ago