run-ai / genv
GPU environment and cluster management with LLM support
☆491 · Updated 5 months ago
Related projects
Alternatives and complementary repositories for genv
- A top-like tool for monitoring GPUs in a cluster ☆80 · Updated 8 months ago
- Module to automatically maximize the utilization of GPU resources in a Kubernetes cluster through real-time dynamic partitioning and elas… ☆629 · Updated 6 months ago
- Module, Model, and Tensor Serialization/Deserialization ☆187 · Updated 3 weeks ago
- TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup… ☆332 · Updated 3 weeks ago
- A library to analyze PyTorch traces. ☆300 · Updated last week
- PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments. ☆739 · Updated last week
- aim-mlflow integration ☆193 · Updated last year
- ClearML - Model-Serving Orchestration and Repository Solution ☆138 · Updated 2 months ago
- Controller for ModelMesh ☆204 · Updated 3 months ago
- Practical GPU Sharing Without Memory Size Constraints ☆225 · Updated last month
- MLCube® is a project that reduces friction for machine learning by ensuring that models are easily portable and reproducible. ☆154 · Updated last month
- Transform datasets at scale. Optimize datasets for fast AI model training. ☆362 · Updated this week
- ClearML Fractional GPU - Run multiple containers on the same GPU with driver-level memory limitation ✨ and compute time-slicing ☆62 · Updated 3 months ago
- Container plugin for Slurm Workload Manager ☆289 · Updated this week
- Markdown docs ☆68 · Updated this week
- ClearML Agent - ML-Ops made easy. ML-Ops scheduler & orchestration solution ☆240 · Updated 2 weeks ago
- Distributed Model Serving Framework ☆154 · Updated last month
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆250 · Updated last month
- TitanML Takeoff Server is an optimization, compression, and deployment platform that makes state-of-the-art machine learning models access… ☆114 · Updated 9 months ago
- CUDA checkpoint and restore utility ☆222 · Updated 6 months ago
- Repository for the open inference protocol specification ☆42 · Updated 3 months ago
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs. ☆183 · Updated 2 months ago
- A simple yet powerful tool to turn traditional container/OS images into unprivileged sandboxes. ☆640 · Updated 2 weeks ago
- MIG Partition Editor for NVIDIA GPUs ☆173 · Updated this week
- Curated list of awesome material on optimization techniques to make artificial intelligence faster and more efficient 🚀 ☆112 · Updated last year
- A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way ☆282 · Updated last month
- Install PyTorch distributions with computation backend auto-detection ☆220 · Updated last year
- Slides, notes, and materials for the workshop ☆305 · Updated 5 months ago