llm-d / llm-d-kv-cache-manager

Distributed KV cache coordinator

☆88

Alternatives and similar repositories for llm-d-kv-cache-manager

Users who are interested in llm-d-kv-cache-manager are comparing it to the repositories listed below.

llm-d / llm-d-inference-scheduler
Inference scheduler for llm-d
☆106 Updated this week
llm-d / llm-d-routing-sidecar
Incubating P/D sidecar for llm-d
☆16 Updated 2 weeks ago
volcano-sh / kthena
☆71 Updated last week
NVIDIA / knavigator
knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes.
☆72 Updated 4 months ago
BaizeAI / kcover
🧯 Kubernetes coverage for fault awareness and recovery, works for any LLMOps, MLOps, AI workloads.
☆33 Updated last week
sgl-project / ome
OME is a Kubernetes operator for enterprise-grade management and serving of Large Language Models (LLMs)
☆322 Updated this week
NVIDIA / topograph
A toolkit for discovering cluster network topology.
☆84 Updated last week
kubernetes-sigs / inference-perf
GenAI inference performance benchmarking tool
☆133 Updated this week
fmperf-project / fmperf
Cloud Native Benchmarking of Foundation Models
☆44 Updated 4 months ago
sgl-project / rbg
A workload for deploying LLM inference services on Kubernetes
☆117 Updated this week
k82cn / kubesim
A simulator of Kubernetes for batch and service workloads.
☆50 Updated 4 years ago
llm-d / llm-d-inference-sim
A lightweight vLLM simulator for mocking out replicas.
☆58 Updated last week
InftyAI / llmaz
☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
☆270 Updated last week
project-codeflare / multi-cluster-app-dispatcher
Holistic job manager on Kubernetes
☆115 Updated last year
llm-d-incubation / llm-d-infra
llm-d helm charts and deployment examples
☆46 Updated last week
ai-dynamo / grove
Kubernetes enhancements for Network Topology Aware Gang Scheduling & Autoscaling
☆117 Updated this week
NVIDIA / k8s-nim-operator
An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment.
☆140 Updated last week
modelpack / modctl
Command-line tools for managing OCI model artifacts, which are bundled based on Model Spec
☆50 Updated last week
kubernetes-sigs / dra-example-driver
Example DRA driver that developers can fork and modify to get them started writing their own.
☆109 Updated last month
InftyAI / Manta
💫 A lightweight p2p-based cache system for model distribution on Kubernetes. Reframing now to make it a unified cache system with POSI…
☆24 Updated 11 months ago
run-ai / fake-gpu-operator
☆168 Updated last month
NVIDIA / go-gpuallocator
Go Abstraction for Allocating NVIDIA GPUs with Custom Policies
☆119 Updated last week
kubernetes-sigs / jobset
JobSet: a k8s native API for distributed ML training and HPC workloads
☆282 Updated last week
llm-d / llm-d-model-service
Simplified model deployment on llm-d
☆27 Updated 5 months ago
kubernetes-sigs / kjob
KJob: Tool for CLI-loving ML researchers
☆39 Updated last week
heyfey / vodascheduler
GPU scheduler for elastic/distributed deep learning workloads in Kubernetes cluster (IC2E'23)
☆34 Updated 2 years ago
volcano-sh / descheduler
The Volcano Descheduler
☆21 Updated 10 months ago
kubernetes-sigs / wg-serving
WG Serving
☆31 Updated last month
volcano-sh / resource-exporter
Resource Exporter for volcano scheduling, e.g. NUMA-Aware scheduling.
☆18 Updated 6 months ago
coreweave / nccl-tests
NVIDIA NCCL Tests for Distributed Training
☆126 Updated 2 weeks ago