microsoft / openpai-runtime
Runtime for deep learning workload
☆20 Updated 2 years ago
Alternatives and similar repositories for openpai-runtime:
Users who are interested in openpai-runtime are comparing it to the libraries listed below.
- Extension to connect OpenPAI clusters, submit AI jobs, simulate jobs locally, manage files, and so on. ☆14 Updated 2 years ago
- Benchmarking Horovod and TF on Batch AI ☆26 Updated 6 years ago
- A marketplace which stores examples and job templates of openpai. Users could use openpaimarketplace to share their jobs or run-and-learn… ☆33 Updated 2 years ago
- Lightweight Deep Learning Model Training library based on PyTorch ☆32 Updated 2 years ago
- A Kubernetes operator for mxnet jobs ☆53 Updated 3 years ago
- This repository contains the results and code for the MLPerf™ Training v0.6 benchmark. ☆42 Updated last year
- PyTorch ObjectDetection Modules and ONNX ops ☆18 Updated last year
- NVIDIA Fleet Command is a hybrid-cloud platform for securely and remotely deploying, managing, and scaling AI across dozens or up to thou… ☆13 Updated 2 years ago
- This repository contains code and config files that accompany the blog post: ☆16 Updated 5 years ago
- State-of-the-art pretrained vision model from Bing Multimedia ☆18 Updated last year
- asv benchmarks for dask projects ☆18 Updated 2 years ago
- 🎱 A demonstration of existing machine learning toolkits on Kubernetes ☆56 Updated 6 years ago
- ☆32 Updated 6 years ago
- ☆18 Updated 6 years ago
- ☆11 Updated 3 years ago
- Needles in Haystacks: On Classifying Tiny Objects in Large Images ☆22 Updated 5 years ago
- Colab notebooks for d2l-book ☆11 Updated 5 years ago
- Robotics Learning Note ☆11 Updated 6 years ago
- Starter Kit for the ACRV Robotic Vision Challenge 1 ☆13 Updated 5 years ago
- Distributed ML Optimizer ☆32 Updated 3 years ago
- Tools for ML/MXNet on Kubernetes. ☆45 Updated 7 years ago
- General-Purpose Kubernetes Pod Controller ☆175 Updated 2 years ago
- Enabling reproducible Machine Learning research ☆43 Updated last year
- Monitor your GPUs whether they are on a single computer or in a cluster ☆162 Updated 5 years ago
- Development repository for integrating FlexFlow (A distributed deep learning framework that supports flexible parallelization strategies)… ☆28 Updated 3 years ago
- A Ray-based data loader with per-epoch shuffling and configurable pipelining, for shuffling and loading training data for distributed tra… ☆18 Updated 2 years ago
- Fabric Manager packaging for Debian ☆14 Updated 3 years ago
- ☆51 Updated 4 years ago
- ☆14 Updated 3 years ago
- benchmarking some transformer deployments ☆26 Updated 2 years ago