General-Purpose Kubernetes Pod Controller
☆173Apr 4, 2023Updated 2 years ago
Alternatives and similar repositories for frameworkcontroller
Users who are interested in frameworkcontroller are comparing it to the repositories listed below
- Kubernetes Scheduler for Deep Learning☆264May 22, 2022Updated 3 years ago
- GPU analyzer for Kubernetes GPU clusters☆17Apr 11, 2020Updated 5 years ago
- Resource scheduling and cluster management for AI☆2,687Jun 6, 2024Updated last year
- Runtime for deep learning workload☆21May 24, 2022Updated 3 years ago
- More Flexible Device Extension Capability in Kubernetes (DevicePlugins++)☆25Jun 12, 2023Updated 2 years ago
- [EOL] A Firmament-based Kubernetes scheduler☆408Jul 19, 2021Updated 4 years ago
- Deep Learning Workspace☆204Jul 18, 2023Updated 2 years ago
- Batch scheduler based on the K8s scheduling framework; related features have been contributed to scheduler-plugins (Deprecated).☆25Aug 6, 2020Updated 5 years ago
- A Kubernetes batch scheduler for high-performance workloads, e.g. AI/ML, BigData, HPC☆1,094May 22, 2023Updated 2 years ago
- A Kubernetes simulator for batch and service workloads.☆50Mar 26, 2021Updated 4 years ago
- NVIDIA device plugin for Kubernetes☆15Sep 9, 2019Updated 6 years ago
- GPU Sharing Scheduler for Kubernetes Cluster☆1,528Dec 29, 2023Updated 2 years ago
- Simulated large clusters for Kubernetes scheduler validation.☆15Jan 3, 2023Updated 3 years ago
- GPU Sharing Device Plugin for Kubernetes Cluster☆492Jan 10, 2023Updated 3 years ago
- Feasibility research for using kind to support e2e tests for kubebuilder(v2)-generated Kubernetes operators.☆13Jul 1, 2019Updated 6 years ago
- Common APIs and libraries shared by other Kubeflow operator repositories.☆53May 28, 2023Updated 2 years ago
- Extension to connect OpenPAI clusters, submit AI jobs, simulate jobs locally, manage files, and so on.☆15Dec 10, 2022Updated 3 years ago
- A LVM2 CSI plugin☆54Mar 21, 2020Updated 5 years ago
- Run your deep learning workloads on Kubernetes more easily and efficiently.☆531Mar 4, 2024Updated last year
- Canary release with helm (Deprecated since compass v2.8)☆13Sep 28, 2020Updated 5 years ago
- A collection of examples for learning how to use Golang.☆14May 4, 2019Updated 6 years ago
- The schedule of the seminar☆25Dec 28, 2021Updated 4 years ago
- A Kubernetes Provider☆287Dec 5, 2025Updated 2 months ago
- Resource-adaptive cluster scheduler for deep learning training.☆454Mar 5, 2023Updated 2 years ago
- Cilium study notes☆23Aug 25, 2020Updated 5 years ago
- Automated Machine Learning on Kubernetes☆1,656Feb 18, 2026Updated last week
- An Efficient Dynamic Resource Scheduler for Deep Learning Clusters☆41Oct 28, 2017Updated 8 years ago
- Fork of NVIDIA device plugin for Kubernetes with support for shared GPUs by declaring GPUs multiple times☆87Jun 7, 2022Updated 3 years ago
- Docker for Your ML/DL Models Based on OCI Artifacts☆474Jan 26, 2024Updated 2 years ago
- benchmark-for-spark☆18May 7, 2025Updated 9 months ago
- A Kubernetes plugin that enables dynamically adding or removing GPU resources for a running Pod☆127Feb 23, 2022Updated 4 years ago
- A marketplace which stores examples and job templates of openpai. Users could use openpaimarketplace to share their jobs or run-and-learn…☆33Dec 13, 2022Updated 3 years ago
- Fault tolerance for DL frameworks☆70Jul 5, 2023Updated 2 years ago
- Distributed AI Model Training and LLM Fine-Tuning on Kubernetes☆2,035Updated this week
- A very thin Kubernetes device plugin that just exposes host device files to containers.☆51Jan 7, 2026Updated last month
- Tools for monitoring NVIDIA GPUs on Linux☆1,068Nov 2, 2021Updated 4 years ago
- Product roadmap for Alibaba Cloud Container Services including ACK, ACR, ASK - Serverless K8S, ACK@Edge and ASM - Service Mesh☆33Nov 15, 2021Updated 4 years ago
- Go library/package to configure SRIOV networking devices☆30Dec 2, 2025Updated 2 months ago
- ☆32Jun 15, 2021Updated 4 years ago