bentoml / BentoVLLM

Self-host LLMs with vLLM and BentoML

☆138

Alternatives and similar repositories for BentoVLLM

Users who are interested in BentoVLLM are comparing it to the libraries listed below.

substratusai / vllm-docker
☆63 · Updated 4 months ago
anyscale / llm-router
Tutorial for building LLM router
☆220 · Updated last year
weaviate / structured-rag
Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models
☆111 · Updated 3 months ago
aniketmaurya / fastserve-ai
Machine Learning Serving focused on GenAI with simplicity as the top priority.
☆59 · Updated 3 weeks ago
premAI-io / benchmarks
🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.
☆137 · Updated last year
cfahlgren1 / observers
A Lightweight Library for AI Observability
☆249 · Updated 5 months ago
amogkam / llama_index_ray
Using LlamaIndex with Ray for productionizing LLM applications
☆71 · Updated 2 years ago
flowaicom / flow-judge
Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte…
☆76 · Updated 9 months ago
aishwaryaprabhat / goku
GenAIOps on Kubernetes: A collection of reference architectures for running GenAI at scale on Kubernetes using OSS tooling
☆132 · Updated 9 months ago
BhabhaAI / dataformer
Solving data for LLMs - Create quality synthetic datasets!
☆150 · Updated 6 months ago
google / lmeval
☆219 · Updated last month
jina-ai / correlations
Simple UI for debugging correlations of text embeddings
☆288 · Updated 2 months ago
AlexBodner / How_Much_VRAM
☆102 · Updated 11 months ago
Not-Diamond / RoRF
Routing on Random Forest (RoRF)
☆181 · Updated 10 months ago
anyscale / e2e-llm-workflows
Fine-tune an LLM to perform batch inference and online serving.
☆112 · Updated 2 months ago
backprop-ai / vllm-benchmark
Benchmarking the serving capabilities of vLLM
☆48 · Updated 11 months ago
sgl-project / sgl-project.github.io
This is the documentation repository for SGLang. It is auto-generated from https://github.com/sgl-project/sglang/tree/main/docs.
☆63 · Updated this week
topoteretes / awesome-ai-memory
A list of AI memory projects
☆185 · Updated 6 months ago
langchain-ai / langchain-elastic
Elasticsearch integration into LangChain
☆58 · Updated 5 months ago
kolenaIO / autoarena
Rank LLMs, RAG systems, and prompts using automated head-to-head evaluation
☆105 · Updated 7 months ago
EveripediaNetwork / fastc
Unattended Lightweight Text Classifiers with LLM Embeddings
☆185 · Updated 10 months ago
VectorInstitute / fed-rag
A framework for fine-tuning retrieval-augmented generation (RAG) systems.
☆124 · Updated 2 weeks ago
ServiceNow / Fast-LLM
Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research
☆217 · Updated this week
mani-kantap / llm-inference-solutions
A collection of all available inference solutions for the LLMs
☆91 · Updated 5 months ago
langchain-ai / langchain-nvidia
☆160 · Updated last week
redis-developer / agentic-rag
Complete example of how to build an Agentic RAG architecture with Redis, Amazon Bedrock, and LlamaIndex.
☆96 · Updated 7 months ago
vllm-project / guidellm
Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
☆438 · Updated last week
asprenger / ray_vllm_inference
A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.
☆69 · Updated last year
darshil3011 / AutoMetaRAG
Dynamic Metadata based RAG Framework
☆75 · Updated last year
chisasaw / redcache-ai
A memory framework for Large Language Models and Agents.
☆183 · Updated 7 months ago