Self-host LLMs with vLLM and BentoML
☆169 · Mar 3, 2026 · Updated 3 months ago
Alternatives and similar repositories for BentoVLLM
Users interested in BentoVLLM are comparing it to the libraries listed below.
- Simple dependency injection framework for Python ☆21 · May 15, 2024 · Updated 2 years ago
- ☆56 · Nov 18, 2024 · Updated last year
- ☆11 · Apr 25, 2021 · Updated 5 years ago
- Turn any OCR models into online inference API endpoint 🚀 🌖 ☆60 · Oct 29, 2025 · Updated 7 months ago
- API serving for your diffusers models ☆11 · Jan 19, 2024 · Updated 2 years ago
- Serving CrewAI Agent as REST API with BentoML, optionally with self-hosted open-source LLMs ☆22 · May 8, 2026 · Updated last month
- ☆12 · Oct 25, 2023 · Updated 2 years ago
- ☆22 · Apr 17, 2025 · Updated last year
- This is "Your Private StackOverflow" app that helps you perform generative search in your code bases. This is built using open-source sta… ☆11 · Aug 14, 2023 · Updated 2 years ago
- Chatbot-to-speech using Orpheus TTS model. Interactive console app. ☆21 · May 1, 2025 · Updated last year
- How to build a sentence embedding application using BentoML ☆15 · Updated this week
- AutoML 2024: HPOD: Hyperparameter Optimization for Unsupervised Outlier Detection ☆13 · Jul 12, 2024 · Updated last year
- LLMPerf is a library for validating and benchmarking LLMs ☆1,119 · Dec 9, 2024 · Updated last year
- ☆22 · May 31, 2024 · Updated 2 years ago
- ☆16 · Feb 21, 2026 · Updated 3 months ago
- ☆13 · Jul 5, 2023 · Updated 2 years ago
- ☆12 · May 20, 2022 · Updated 4 years ago
- Implementation of a simple frame identification approach (SimpleFrameId) described in the paper "Out-of-domain FrameNet Semantic Role Lab… ☆15 · Apr 3, 2017 · Updated 9 years ago
- The search for the best Conversational AI pipeline ☆14 · May 11, 2020 · Updated 6 years ago
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs ☆1,223 · Jun 5, 2026 · Updated last week
- Lightweight Python Wrapper for OpenVINO, enabling LLM inference on NPUs ☆29 · Dec 17, 2024 · Updated last year
- ☆345 · Updated this week
- ☆23 · May 23, 2025 · Updated last year
- PANiC - PAraphrasing Noun-Compounds ☆15 · Apr 6, 2018 · Updated 8 years ago
- Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud. ☆12,352 · Updated this week
- Sample FastAPI Application to demonstrate OpenTelemetry instrumentation ☆20 · Jul 26, 2025 · Updated 10 months ago
- Fine-tune an LLM to perform batch inference and online serving. ☆120 · May 29, 2025 · Updated last year
- A simple node to download repos from HF specify a repo ID or File create a folder where you want to download the files then rename the fo… ☆25 · Jul 14, 2025 · Updated 10 months ago
- The Runpod worker template for serving our large language model endpoints. Powered by vLLM. ☆452 · Jun 4, 2026 · Updated last week
- Test Orchestrator for Performance and Scalability of AI pLatforms ☆18 · May 26, 2026 · Updated 2 weeks ago
- VS Code extension for creating remote workstations (sessions) using ClearML. ☆16 · Mar 14, 2024 · Updated 2 years ago
- A high-performance batching router that optimises max throughput for text inference workloads ☆16 · Sep 6, 2023 · Updated 2 years ago
- LLM Beam Search Example Implementation ☆13 · May 3, 2024 · Updated 2 years ago
- Evaluating language model responses using a reward model ☆30 · Feb 23, 2024 · Updated 2 years ago
- SOC Analyst Level 1 Replacement using RAG LLM ☆28 · Aug 16, 2024 · Updated last year
- This is a boilerplate which has dependencies for pyspark(3.3.0) mongo(>4.x) connectivity ☆10 · May 3, 2024 · Updated 2 years ago
- ☆48 · Nov 8, 2023 · Updated 2 years ago
- ☆13 · Jan 30, 2025 · Updated last year
- vLLM adapter for a TGIS-compatible gRPC server. ☆55 · Updated this week