substratusai / vllm-docker
☆49 · Updated 2 weeks ago
Alternatives and similar repositories for vllm-docker:
Users who are interested in vllm-docker are comparing it to the libraries listed below.
- Self-host LLMs with vLLM and BentoML ☆79 · Updated this week
- ☆52 · Updated 7 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆97 · Updated last month
- 🚀 Scale your RAG pipeline using Ragswift: A scalable centralized embeddings management platform ☆37 · Updated 11 months ago
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs ☆183 · Updated last month
- A stable, fast and easy-to-use inference library with a focus on a sync-to-async API ☆44 · Updated 3 months ago
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving. ☆60 · Updated 9 months ago
- GPT-4 Level Conversational QA Trained In a Few Hours ☆58 · Updated 4 months ago
- ☆18 · Updated 4 months ago
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆58 · Updated 3 weeks ago
- Tutorial for building LLM router ☆170 · Updated 5 months ago
- Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform ☆82 · Updated last week
- ☆150 · Updated this week
- Develop, evaluate and monitor LLM applications at scale ☆98 · Updated last month
- ☆68 · Updated 2 months ago
- Using LlamaIndex with Ray for productionizing LLM applications ☆71 · Updated last year
- A guidance compatibility layer for llama-cpp-python ☆34 · Updated last year
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. ☆138 · Updated 5 months ago
- experiments with inference on llama ☆104 · Updated 7 months ago
- ☆42 · Updated last week
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub ☆156 · Updated last year
- Python client library for improving your LLM app accuracy ☆96 · Updated this week
- Deploy a light and full OpenAI API for production with vLLM, supporting /v1/embeddings with all embedding models ☆39 · Updated 6 months ago
- ☆38 · Updated last year
- Tutorial to get started with SkyPilot! ☆56 · Updated 8 months ago
- ☆74 · Updated last year
- vLLM adapter for a TGIS-compatible gRPC server. ☆15 · Updated this week
- ☆65 · Updated 7 months ago
- Machine Learning Serving focused on GenAI with simplicity as the top priority. ☆58 · Updated last week