☆67 Mar 28, 2025 Updated last year
Alternatives and similar repositories for vllm-docker
Users that are interested in vllm-docker are comparing it to the libraries listed below.
- OpenAI compatible API for open source LLMs ☆17 Oct 30, 2023 Updated 2 years ago
- LLMs as Collaboratively Edited Knowledge Bases ☆45 Feb 8, 2026 Updated 2 months ago
- Graph model execution API for Candle ☆17 Jul 27, 2025 Updated 8 months ago
- vLLM adapter for a TGIS-compatible gRPC server. ☆55 Apr 12, 2026 Updated last week
- The Open-Source Implementation of Cognition AI's Automated Software Engineer, Devin. ☆16 Mar 13, 2024 Updated 2 years ago
- 🚀 LLM inference optimization simulator, modeling compute-bound prefill and memory-bound decode phases. ☆14 Jul 12, 2025 Updated 9 months ago
- Reference implementation for the climate segmentation benchmark, based on the Exascale Deep Learning for Climate Analytics work ☆10 May 6, 2020 Updated 5 years ago
- A powerful and user-friendly tool that generates detailed captions for your images ☆21 Nov 11, 2024 Updated last year
- A CUDA runtime environment based on the CUDA Driver API ☆16 Jul 30, 2025 Updated 8 months ago
- Machine Learning Inference Graph Spec ☆21 Jul 27, 2019 Updated 6 years ago
- ☆12 Dec 8, 2020 Updated 5 years ago
- ☆15 Nov 17, 2015 Updated 10 years ago
- ☆28 May 3, 2023 Updated 2 years ago
- A high-performance batching router that optimises max throughput for text inference workloads ☆16 Sep 6, 2023 Updated 2 years ago
- Applying Evaluation Driven Development (EDD) to aid in the design decision of RAG pipelines ☆31 Oct 20, 2023 Updated 2 years ago
- ☆15 Jun 1, 2019 Updated 6 years ago
- experiments with inference on llama ☆103 Jun 6, 2024 Updated last year
- Homeworks, Midterm, & Capstone from ML BookCamp ☆16 Jan 28, 2022 Updated 4 years ago
- Prediction of the activity of molecules/ligands that have been tested to bind or not bind to Beta-Lactamases using machine learning cl… ☆10 Mar 5, 2026 Updated last month
- This repo is the central repo for all the RAG Evaluation reference material and partner workshop ☆85 Apr 25, 2025 Updated 11 months ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆32 Sep 19, 2025 Updated 7 months ago
- ☆43 Jul 23, 2015 Updated 10 years ago
- ZED LiveLink Plugin for Unreal ☆39 Jan 22, 2026 Updated 2 months ago
- Simple HAProxy configuration linter ☆10 Apr 26, 2016 Updated 9 years ago
- bundled swagger-ui pip package ☆21 Sep 4, 2025 Updated 7 months ago
- OpenAI compatible API for TensorRT LLM triton backend ☆219 Aug 1, 2024 Updated last year
- J.A.R.V.I.S is a very advanced virtual assistant who can automate almost all tasks of everything of PC & IoT. Just Say It. ☆11 Jul 29, 2021 Updated 4 years ago
- ☆12 Jun 17, 2025 Updated 10 months ago
- Kotlin Multiplatform Home Workout Exercise App ☆14 Mar 5, 2025 Updated last year
- Smart reproducible analytical pipeline inspection ☆21 Feb 13, 2026 Updated 2 months ago
- Python Binding for Rust WhatLang, a language detection library ☆14 Jan 5, 2024 Updated 2 years ago
- The API Gateway & Microservice Management Layer, built on NGINX ☆11 Jul 5, 2018 Updated 7 years ago
- Implementations of transformer models in pytorch ☆14 Jun 2, 2020 Updated 5 years ago
- A stable, fast and easy-to-use inference library with a focus on a sync-to-async API ☆48 Sep 26, 2024 Updated last year
- ☆20 Feb 17, 2023 Updated 3 years ago
- A2AMCP is an Agent2Agent MCP communication Server taking the concept from Google's Agent2Agent Protocol (A2A) ☆19 Jun 9, 2025 Updated 10 months ago
- Automatic HasFlagNonAlloc method generator for C# & Unity ☆17 Nov 25, 2025 Updated 4 months ago
- 🔎 A deep-dive into HyDE for Advanced LLM RAG + 💡 Introducing AutoHyDE, a semi-supervised framework to improve the effectiveness, covera… ☆34 Mar 26, 2024 Updated 2 years ago
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆63 Sep 18, 2025 Updated 7 months ago