A Kubernetes operator for mxnet jobs
☆52Dec 1, 2021Updated 4 years ago
Alternatives and similar repositories for mxnet-operator
Users that are interested in mxnet-operator are comparing it to the libraries listed below.
- Common APIs and libraries shared by other Kubeflow operator repositories.☆53May 28, 2023Updated 2 years ago
- Kubernetes Operator for AI and Bigdata Elastic Training☆91Jan 10, 2025Updated last year
- Studying GPU Multi-tenancy☆11Jan 11, 2019Updated 7 years ago
- Dynamic training with Apache MXNet reduces cost and time for training deep neural networks by leveraging AWS cloud elasticity and scale. …☆56Nov 25, 2022Updated 3 years ago
- Tools for ML/MXNet on Kubernetes.☆43Feb 11, 2018Updated 8 years ago
- Experimental repository for a caffe2 operator☆16Dec 1, 2021Updated 4 years ago
- NVIDIA device plugin for Kubernetes☆15Sep 9, 2019Updated 6 years ago
- Kernel for Kubeflow in Jupyter Notebook☆65Aug 13, 2019Updated 6 years ago
- Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.)☆522Updated this week
- [WIP] Open Source WakaTime Server☆14Feb 4, 2019Updated 7 years ago
- A simple tool for parsing the profile.json file of mxnet☆14Aug 1, 2018Updated 7 years ago
- PyTorch on Kubernetes☆309Dec 1, 2021Updated 4 years ago
- Deep exponential family models in MXNet/Gluon. Layers o' latents 💤☆17Oct 16, 2017Updated 8 years ago
- A Kubernetes plugin that enables dynamically adding or removing GPU resources for a running Pod☆127Feb 23, 2022Updated 4 years ago
- ☆11May 22, 2017Updated 8 years ago
- GluonNLP tutorial for Pycon2019☆14Aug 16, 2019Updated 6 years ago
- ☆31Jun 15, 2021Updated 4 years ago
- ☆131Apr 19, 2021Updated 4 years ago
- Distributed AI Model Training and LLM Fine-Tuning on Kubernetes☆2,074Apr 3, 2026Updated last week
- A batch scheduler of kubernetes for high performance workload, e.g. AI/ML, BigData, HPC☆1,092May 22, 2023Updated 2 years ago
- High performance NCCL plugin for Bagua.☆15Sep 15, 2021Updated 4 years ago
- 👩‍🔬 [Experimental] Easily train and serve ML models on Kubernetes, directly from your python code.☆31Nov 8, 2018Updated 7 years ago
- Information about the Kubeflow community including proposals and governance information.☆194Updated this week
- Batch scheduler based on the K8s scheduling framework; related features have been contributed to scheduler-plugins (Deprecated).☆25Aug 6, 2020Updated 5 years ago
- A collection of common util libraries for Go☆25Oct 25, 2020Updated 5 years ago
- Master's thesis literature notes (Deep Learning, Scheduling, Distributed, Kubernetes)☆51May 5, 2019Updated 6 years ago
- MXNet (AI/ML) bindings for the Crystal language.☆22Jul 16, 2021Updated 4 years ago
- Implementation of Capsule Net from the paper Dynamic Routing Between Capsules☆24Nov 12, 2017Updated 8 years ago
- Automatic tuning for ML model deployment on Kubernetes☆80Nov 1, 2024Updated last year
- Custom Scheduler to deploy ML models to TRTIS for GPU Sharing☆11Apr 1, 2020Updated 6 years ago
- Incubating project for xgboost operator☆77Dec 1, 2021Updated 4 years ago
- A Ray-based data loader with per-epoch shuffling and configurable pipelining, for shuffling and loading training data for distributed tra…☆18Jan 5, 2023Updated 3 years ago
- Simulated large clusters for Kubernetes scheduler validation.☆15Jan 3, 2023Updated 3 years ago
- The DayTrader 3 benchmark sample, which is a Java EE 6 application built around the paradigm of an online stock trading system.☆11Nov 18, 2019Updated 6 years ago
- ☆123Nov 1, 2022Updated 3 years ago
- Go Abstraction for Allocating NVIDIA GPUs with Custom Policies☆122Apr 1, 2026Updated last week
- benchmark-for-spark☆18May 7, 2025Updated 11 months ago
- ☆23Mar 8, 2016Updated 10 years ago
- ☆10Jul 29, 2020Updated 5 years ago