substratusai / kubeai
AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-text.
☆719 · Updated this week
Alternatives and similar repositories for kubeai:
Users who are interested in kubeai are comparing it to the repositories listed below.
- Helm chart for Ollama on Kubernetes ☆368 · Updated this week
- Community-maintained Kubernetes config and Helm chart for Langfuse ☆76 · Updated 3 weeks ago
- ☆122 · Updated this week
- LeaderWorkerSet: An API for deploying a group of pods as a unit of replication ☆257 · Updated this week
- Kubernetes AI Toolchain Operator ☆531 · Updated this week
- Open source platform for AI Engineering: OpenTelemetry-native LLM Observability, GPU Monitoring, Guardrails, Evaluations, Prompt Manageme… ☆1,250 · Updated this week
- Automatic SRE Superpowers within your Kubernetes cluster ☆343 · Updated this week
- Finetune LLMs on K8s by using Runbooks ☆170 · Updated 5 months ago
- deployKF builds machine learning platforms on Kubernetes. We combine the best of Kubeflow, Airflow†, and MLflow† into a complete platform… ☆397 · Updated 6 months ago
- AI-native (edge and LLM) proxy for agents. Handles all the pesky heavy lifting in building agentic apps -- fast ⚡️ query routing, seamle… ☆1,648 · Updated this week
- Envoy AI Gateway is an open source project for using Envoy Gateway to handle request traffic from application clients to Generative AI se… ☆132 · Updated this week
- Yet another operator for running large language models on Kubernetes with ease. Powered by Ollama! 🐫 ☆138 · Updated 2 weeks ago
- kro | Kube Resource Orchestrator ☆961 · Updated this week
- OpenTelemetry Instrumentation for AI Observability ☆297 · Updated this week
- Gateway API Inference Extension ☆150 · Updated this week
- An AI-Powered assistant for Kubernetes developers ☆175 · Updated last year
- Stateful load balancer custom-tailored for llama.cpp 🏓🦙 ☆706 · Updated 3 weeks ago
- ☆112 · Updated this week
- Kubernetes-native Job Queueing ☆1,625 · Updated this week
- Your friendly and safe CLI Copilot ☆255 · Updated 5 months ago
- Dynamic Resource Allocation (DRA) for NVIDIA GPUs in Kubernetes ☆312 · Updated this week
- 🧬 Helix is a private GenAI stack for building AI applications with declarative pipelines, knowledge (RAG), API bindings, and first-class… ☆415 · Updated this week
- 🪶 Lightweight OpenAI drop-in replacement for Kubernetes ☆143 · Updated last year
- Self-host LLMs with vLLM and BentoML ☆86 · Updated this week
- Langtrace 🔍 is an open-source, Open Telemetry based end-to-end observability tool for LLM applications, providing real-time tracing, ev… ☆762 · Updated last week
- Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity. ☆750 · Updated this week
- Open Weight, tool-calling LLMs ☆151 · Updated 3 months ago
- JobSet: a k8s native API for distributed ML training and HPC workloads ☆187 · Updated this week
- This is a fork/refactoring of the ajmyyra/ambassador-auth-oidc project ☆88 · Updated 10 months ago