A Kubernetes operator for MXNet jobs
☆52 · Dec 1, 2021 · Updated 4 years ago
Alternatives and similar repositories for mxnet-operator
Users that are interested in mxnet-operator are comparing it to the libraries listed below.
- Common APIs and libraries shared by other Kubeflow operator repositories. ☆53 · May 28, 2023 · Updated 3 years ago
- Kubernetes Operator for AI and Bigdata Elastic Training ☆91 · Jan 10, 2025 · Updated last year
- Studying GPU Multi-tenancy ☆11 · Jan 11, 2019 · Updated 7 years ago
- Dynamic training with Apache MXNet reduces cost and time for training deep neural networks by leveraging AWS cloud elasticity and scale. … ☆56 · Nov 25, 2022 · Updated 3 years ago
- Experimental repository for a caffe2 operator ☆16 · Dec 1, 2021 · Updated 4 years ago
- NVIDIA device plugin for Kubernetes ☆15 · Sep 9, 2019 · Updated 6 years ago
- Kernel for Kubeflow in Jupyter Notebook ☆65 · Aug 13, 2019 · Updated 6 years ago
- Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.) ☆528 · Jun 2, 2026 · Updated last week
- A simple tool for parsing the profile.json file of mxnet ☆14 · Aug 1, 2018 · Updated 7 years ago
- PyTorch on Kubernetes ☆310 · Dec 1, 2021 · Updated 4 years ago
- A Kubernetes plugin that enables dynamically adding or removing GPU resources for a running Pod ☆127 · Feb 23, 2022 · Updated 4 years ago
- Clojure Package for MXNET ☆69 · Jul 1, 2018 · Updated 7 years ago
- GluonNLP tutorial for Pycon2019 ☆14 · Aug 16, 2019 · Updated 6 years ago
- Experimental flow-based Kubernetes scheduler ☆34 · Jan 4, 2018 · Updated 8 years ago
- Logging MXNet data for visualization in TensorBoard. ☆324 · Nov 30, 2021 · Updated 4 years ago
- ☆131 · Apr 19, 2021 · Updated 5 years ago
- Distributed AI Model Training and LLM Fine-Tuning on Kubernetes ☆2,112 · Updated this week
- A batch scheduler of kubernetes for high performance workload, e.g. AI/ML, BigData, HPC ☆1,092 · May 22, 2023 · Updated 3 years ago
- 👩🔬 [Experimental] Easily train and serve ML models on Kubernetes, directly from your python code. ☆31 · Nov 8, 2018 · Updated 7 years ago
- Information about the Kubeflow community including proposals and governance information. ☆194 · Jun 2, 2026 · Updated last week
- Batch scheduler based on the K8s scheduling framework; related features have been contributed to scheduler-plugins (Deprecated). ☆26 · Aug 6, 2020 · Updated 5 years ago
- A collection of common util libraries for Go ☆25 · Oct 25, 2020 · Updated 5 years ago
- Volume Controller for Kubernetes ☆67 · Jan 3, 2023 · Updated 3 years ago
- ☆16 · Nov 6, 2019 · Updated 6 years ago
- Master's thesis literature notes (Deep Learning, Scheduling, Distributed, Kubernetes) ☆51 · May 5, 2019 · Updated 7 years ago
- MXNet (AI/ML) bindings for the Crystal language. ☆22 · Jul 16, 2021 · Updated 4 years ago
- Automatic tuning for ML model deployment on Kubernetes ☆80 · Nov 1, 2024 · Updated last year
- Custom Scheduler to deploy ML models to TRTIS for GPU Sharing ☆12 · Apr 1, 2020 · Updated 6 years ago
- Incubating project for xgboost operator ☆77 · Dec 1, 2021 · Updated 4 years ago
- Simulated large clusters for Kubernetes scheduler validation. ☆15 · Jan 3, 2023 · Updated 3 years ago
- An OCM addon that automates the installation of Kubernetes' konnectivity servers and agents. ☆53 · May 21, 2026 · Updated 3 weeks ago
- ☆123 · Nov 1, 2022 · Updated 3 years ago
- Go Abstraction for Allocating NVIDIA GPUs with Custom Policies ☆122 · Apr 21, 2026 · Updated last month
- benchmark-for-spark ☆18 · May 7, 2025 · Updated last year
- ☆23 · Mar 8, 2016 · Updated 10 years ago
- ☆10 · Jul 29, 2020 · Updated 5 years ago
- Resource-adaptive cluster scheduler for deep learning training. ☆459 · Mar 5, 2023 · Updated 3 years ago
- Production grade Kubernetes controller for managing AWS Services using CRDs ☆16 · Apr 8, 2020 · Updated 6 years ago
- GPU Sharing Device Plugin for Kubernetes Cluster ☆494 · Jan 10, 2023 · Updated 3 years ago