ome-projects/ome

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/ome-projects/ome)

ome-projects / ome

Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, TensorRT-LLM, and Triton

☆481

Alternatives and similar repositories for ome

Users that are interested in ome are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

sgl-project / genai-bench
View on GitHub
Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serv…
☆314Updated this week
lightseekorg / smg
View on GitHub
Engine-agnostic LLM gateway in Rust. Full OpenAI & Anthropic API compatibility across vLLM, TRT-LLM, TokenSpeed, SGLang, OpenAI, Gemini &…
☆406Updated this week
sgl-project / rbg
View on GitHub
A workload for deploying LLM inference services on Kubernetes
☆262Updated this week
kubernetes-sigs / gateway-api-inference-extension
View on GitHub
Gateway API Inference Extension
☆718Updated this week
kubernetes-sigs / lws
View on GitHub
LeaderWorkerSet: An API for deploying a group of pods as a unit of replication
☆766Updated this week
Proton VPN Special Offer - Get 70% off • Ad
Special partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
llm-d / llm-d
View on GitHub
Achieve state of the art inference performance with modern accelerators on Kubernetes
☆3,843Updated this week
bytedance / InfiniStore
View on GitHub
KV cache store for distributed LLM inference
☆425Nov 13, 2025Updated 8 months ago
volcano-sh / kthena
View on GitHub
Kubernetes-native AI serving platform for scalable model serving.
☆394Updated this week
ai-dynamo / nixl
View on GitHub
NVIDIA Inference Xfer Library (NIXL)
☆1,139Updated this week
sgl-project / sgl-learning-materials
View on GitHub
Materials for learning SGLang
☆860Jan 5, 2026Updated 6 months ago
kvcache-ai / Mooncake
View on GitHub
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
☆5,925Updated this week
ai-dynamo / dynamo
View on GitHub
A Datacenter Scale Distributed Inference Serving Framework
☆7,540Updated this week
kai-scheduler / KAI-Scheduler
View on GitHub
KAI Scheduler is an open source Kubernetes Native scheduler for AI workloads at large scale
☆1,401Updated this week
ovg-project / kvcached
View on GitHub
Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond
☆1,106Updated this week
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
vllm-project / production-stack
View on GitHub
vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization
☆2,460Jul 13, 2026Updated last week
kubernetes-sigs / inference-perf
View on GitHub
GenAI inference performance benchmarking tool
☆211Updated this week
kubernetes-sigs / dra-driver-nvidia-gpu
View on GitHub
DRA Driver for NVIDIA GPUs
☆674Updated this week
llm-d / llm-d-router
View on GitHub
llm-d Router: The intelligent entry point for inference requests
☆261Updated this week
sgl-project / SpecForge
View on GitHub
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
☆997Updated this week
radixark / miles
View on GitHub
Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.
☆1,759Updated this week
sgl-project / sglang-jax
View on GitHub
JAX backend for SGL
☆311Updated this week
llm-d / llm-d-kv-cache
View on GitHub
Distributed KV cache scheduling & offloading libraries
☆162Updated this week
InftyAI / llmaz
View on GitHub
☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
☆309Jan 26, 2026Updated 5 months ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
ai-dynamo / aiconfigurator
View on GitHub
Offline optimization of your disaggregated Dynamo graph
☆368Updated this week
lightseekorg / tokenspeed
View on GitHub
TokenSpeed is a speed-of-light LLM inference engine.
☆1,638Updated this week
flashinfer-ai / flashinfer
View on GitHub
FlashInfer: Kernel Library for LLM Serving
☆5,988Updated this week
uccl-project / uccl
View on GitHub
UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g…
☆1,467Updated this week
sgl-project / sglang
View on GitHub
SGLang is a high-performance serving framework for large language models and multimodal models.
☆30,545Updated this week
run-ai / fake-gpu-operator
View on GitHub
☆295Jul 5, 2026Updated 2 weeks ago
ai-dynamo / grove
View on GitHub
Kubernetes enhancements for Network Topology Aware Gang Scheduling & Autoscaling
☆242Updated this week
run-ai / runai-model-streamer
View on GitHub
☆330Updated this week
sgl-project / sglang-omni
View on GitHub
SGLang Omni: High-Performance Multi-Stage Pipeline Framework for Omni Models
☆655Updated this week
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
kerthcet / github-workflow-as-kube
View on GitHub
Following the same workflows as Kubernetes. Widely used in InftyAI community.
☆13May 31, 2026Updated last month
MoonshotAI / checkpoint-engine
View on GitHub
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
☆970Jul 4, 2026Updated 2 weeks ago
sgl-project / mini-sglang
View on GitHub
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
☆4,607May 17, 2026Updated 2 months ago
ai-dynamo / modelexpress
View on GitHub
Model Express is a Rust-based component meant to be placed next to existing model inference systems to speed up their startup times and i…
☆86Updated this week
llumnix-project / llumnix-ray
View on GitHub
Efficient and easy multi-instance LLM serving
☆562Mar 12, 2026Updated 4 months ago
kubernetes-sigs / wg-serving
View on GitHub
WG Serving
☆38Mar 24, 2026Updated 3 months ago
Project-HAMi / HAMi
View on GitHub
Heterogeneous GPU Sharing on Kubernetes
☆3,987Updated this week