project-codeflare / mlbatch
Queuing and quota management for AI/ML batch jobs on Kubernetes
☆14 Updated 4 months ago
Alternatives and similar repositories for mlbatch
Users who are interested in mlbatch are comparing it to the repositories listed below
- Run Slurm as a Kubernetes scheduler. A Slinky project. ☆50 Updated last week
- Simplifying the definition and execution, scaling and deployment of pipelines on the cloud. ☆236 Updated 2 years ago
- Estimate resources needed to train LLMs ☆13 Updated 9 months ago
- ☆57 Updated last week
- Python library for Synthetic Data Generation ☆51 Updated this week
- InstaSlice facilitates the use of Dynamic Resource Allocation (DRA) on Kubernetes clusters for GPU sharing ☆30 Updated last year
- An intuitive, easy-to-use python interface for batch resource requesting, access, job submission, and observation. Simplifying the develo… ☆32 Updated this week
- Model Registry provides a single pane of glass for ML model developers to index and manage models, versions, and ML artifacts metadata. I… ☆152 Updated last week
- Run Slurm in Kubernetes ☆327 Updated this week
- Repository for open inference protocol specification ☆60 Updated 6 months ago
- Run Slurm on Kubernetes. A Slinky project. ☆193 Updated this week
- InstaSlice Operator facilitates slicing of accelerators using stable APIs ☆47 Updated last week
- ☆42 Updated this week
- llm-d benchmark scripts and tooling ☆33 Updated this week
- Holistic job manager on Kubernetes ☆115 Updated last year
- Taxonomy tree that will allow you to create models tuned with your data ☆287 Updated 2 months ago
- A top-like tool for monitoring GPUs in a cluster ☆85 Updated last year
- Python library for Evaluation ☆16 Updated this week
- JobSet: a k8s native API for distributed ML training and HPC workloads ☆282 Updated this week
- Python client for Google Kaniko ☆11 Updated 3 years ago
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆140 Updated this week
- Helm charts for llm-d ☆50 Updated 4 months ago
- Kubeflow Pipelines on Tekton ☆182 Updated last year
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆62 Updated 2 months ago
- MIG Partition Editor for NVIDIA GPUs ☆230 Updated this week
- ☆12 Updated last year
- Fybrik ☆132 Updated 2 months ago
- KJob: Tool for CLI-loving ML researchers ☆39 Updated this week
- An open source benchmarking framework for IT automation ☆249 Updated last week
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆113 Updated this week