GPU environment and cluster management with LLM support
☆658 May 16, 2024 Updated last year
Alternatives and similar repositories for genv
Users that are interested in genv are comparing it to the libraries listed below
- A top-like tool for monitoring GPUs in a cluster ☆84 Feb 14, 2024 Updated 2 years ago
- ☆286 Feb 25, 2026 Updated last week
- ☆221 Feb 23, 2026 Updated last week
- KAI Scheduler is an open source Kubernetes Native scheduler for AI workloads at large scale ☆1,160 Updated this week
- markdown docs ☆94 Feb 1, 2026 Updated last month
- Tensors, for human consumption ☆1,360 Jan 22, 2026 Updated last month
- Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 20+ clouds, o… ☆9,516 Updated this week
- dstack is an open-source control plane for running development, training, and inference jobs on GPUs—across hyperscalers, neoclouds, or o… ☆2,055 Updated this week
- HAMi-core compiles libvgpu.so, which ensures hard limit on GPU in container ☆278 Feb 25, 2026 Updated last week
- NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes ☆2,572 Updated this week
- The NVIDIA Driver Manager is a Kubernetes component which assists in seamless upgrades of NVIDIA Driver on each node of the cluster. ☆49 Feb 23, 2026 Updated last week
- GPUd automates monitoring, diagnostics, and issue identification for GPUs ☆476 Updated this week
- ClearML Fractional GPU - Run multiple containers on the same GPU with driver level memory limitation ✨ and compute time-slicing ☆90 Nov 13, 2025 Updated 3 months ago
- The Triton Inference Server provides an optimized cloud and edge inferencing solution. ☆10,393 Updated this week
- GPU plugin to the node feature discovery for Kubernetes ☆307 May 27, 2024 Updated last year
- A Datacenter Scale Distributed Inference Serving Framework ☆6,154 Updated this week
- ☆794 Feb 23, 2026 Updated last week
- NVIDIA DRA Driver for GPUs ☆574 Feb 26, 2026 Updated last week
- Practical GPU Sharing Without Memory Size Constraints ☆306 Mar 28, 2025 Updated 11 months ago
- A JupyterLab extension for displaying dashboards of GPU usage. ☆669 Feb 23, 2026 Updated last week
- 🚀 Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models. ☆13 Jan 30, 2026 Updated last month
- Simple, safe way to store and distribute tensors ☆3,645 Updated this week
- Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.) ☆516 Feb 23, 2026 Updated last week
- MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integra… ☆1,654 Updated this week
- LeaderWorkerSet: An API for deploying a group of pods as a unit of replication ☆673 Feb 26, 2026 Updated last week
- Tools for building GPU clusters ☆1,421 Feb 23, 2026 Updated last week
- Cost-efficient and pluggable Infrastructure components for GenAI inference ☆4,650 Updated this week
- ZenML 🙏: One AI Platform from Pipelines to Agents. https://zenml.io. ☆5,228 Updated this week
- Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.