substratusai / vllm-docker
☆67 Updated 9 months ago
Alternatives and similar repositories for vllm-docker
Users who are interested in vllm-docker are comparing it to the libraries listed below.
- Self-host LLMs with vLLM and BentoML ☆163 Updated this week
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. ☆138 Updated last year
- Machine Learning Serving focused on GenAI with simplicity as the top priority. ☆59 Updated 2 weeks ago
- A collection of all available inference solutions for the LLMs ☆94 Updated 10 months ago
- The backend behind the LLM-Perf Leaderboard ☆11 Updated last year
- ☆18 Updated last year
- Deploy a light and full OpenAI API for production with vLLM to support /v1/embeddings with all embeddings models. ☆44 Updated last year
- Tutorial to get started with SkyPilot! ☆58 Updated last year
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆114 Updated 9 months ago
- ☆51 Updated last year
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆63 Updated 4 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆111 Updated last year
- ☆198 Updated last year
- ☆82 Updated 2 months ago
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving. ☆78 Updated last year
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ☆81 Updated last year
- experiments with inference on llama ☆103 Updated last year
- Scale your RAG pipeline using Ragswift: A scalable centralized embeddings management platform ☆38 Updated last year
- Using LlamaIndex with Ray for productionizing LLM applications ☆71 Updated 2 years ago
- Data preparation code for Amber 7B LLM ☆94 Updated last year
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ☆277 Updated this week
- Tutorial for building LLM router ☆242 Updated last year
- TitanML Takeoff Server is an optimization, compression and deployment platform that makes state of the art machine learning models access… ☆114 Updated last year
- Open Implementations of LLM Analyses ☆107 Updated last year
- GPT-4 Level Conversational QA Trained In a Few Hours ☆66 Updated last year
- Benchmark suite for LLMs from Fireworks.ai ☆85 Updated this week
- Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app… ☆169 Updated 2 years ago
- vLLM adapter for a TGIS-compatible gRPC server. ☆47 Updated this week
- Python client library for improving your LLM app accuracy ☆97 Updated 11 months ago
- Simple examples using Argilla tools to build AI ☆57 Updated last year