lfedgeai / SPEAR
Distributed Cloud-Edge Collaborative AI Agent Platform
☆32Updated last week
Alternatives and similar repositories for SPEAR
Users that are interested in SPEAR are comparing it to the libraries listed below
- A lightweight vLLM simulator for mocking out replicas.☆85Updated this week
- Inference scheduler for llm-d☆124Updated last week
- Distributed KV cache scheduling & offloading libraries☆101Updated this week
- The main purpose of runtime copilot is to assist with node runtime management tasks such as configuring registries, upgrading versions, i…☆12Updated 2 years ago
- A toolkit for discovering cluster network topology.☆96Updated last week
- Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond☆773Updated this week
- Lightweight daemon for monitoring CUDA runtime API calls with eBPF uprobes☆146Updated 10 months ago
- knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes.☆74Updated 6 months ago
- Cloud Native Benchmarking of Foundation Models☆45Updated 6 months ago
- 🎉 An awesome & curated list of best LLMOps tools.☆190Updated this week
- 🧯 Kubernetes coverage for fault awareness and recovery, works for any LLMOps, MLOps, AI workloads.☆35Updated this week
- A workload for deploying LLM inference services on Kubernetes☆168Updated last week
- Command-line tools for managing OCI model artifacts, which are bundled based on Model Spec☆61Updated this week
- llm-d benchmark scripts and tooling☆44Updated this week
- Cloud Native Artificial Intelligence Model Format Specification☆175Updated last week
- Kubernetes-native AI serving platform for scalable model serving.☆198Updated last week
- llm-d helm charts and deployment examples☆48Updated last month
- A tool for coordinated checkpoint/restore of distributed applications with CRIU☆31Updated 5 months ago
- ☆38Updated 3 months ago
- Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, T…☆365Updated this week
- A library developed by Volcano Engine for high-performance reading and writing of PyTorch model files.☆25Updated last year
- ☆24Updated last week
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment.☆146Updated this week
- Push-Button End-to-End Testing of Kubernetes Operators and Controllers☆130Updated last month
- Kubernetes Container Runtime Interface proxy service with hardware resource aware workload placement policies☆178Updated 6 months ago
- Model Express is a Rust-based component meant to be placed next to existing model inference systems to speed up their startup times and i…☆25Updated this week
- A tool to detect infrastructure issues on cloud native AI systems☆52Updated 4 months ago
- Systematic and comprehensive benchmarks for LLM systems.☆50Updated last week
- ☆17Updated 7 months ago
- Offline optimization of your disaggregated Dynamo graph☆177Updated last week