pengyanghua/optimus

A Deep Learning Cluster Scheduler

☆36

Alternatives and similar repositories for optimus

Users who are interested in optimus are comparing it to the libraries listed below.

pengyanghua / DL2
a deep learning-driven scheduler for elastic training in deep learning clusters
☆31 · Jan 14, 2021 · Updated 5 years ago
S-Lab-System-Group / ChronusArtifact
☆23 · Jan 7, 2022 · Updated 4 years ago
kzhang28 / Optimus
An Efficient Dynamic Resource Scheduler for Deep Learning Clusters
☆41 · Oct 28, 2017 · Updated 8 years ago
hiddenlayer2020 / ML-Job-Scheduler-MLFS
☆13 · Dec 18, 2020 · Updated 5 years ago
alibaba / GPU-scheduler-for-deep-learning
GPU-scheduler-for-deep-learning
☆213 · Nov 5, 2020 · Updated 5 years ago
SymbioticLab / ModelKeeper
A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup
☆36 · Jan 9, 2023 · Updated 3 years ago
SymbioticLab / Tiresias
Tiresias is a GPU cluster manager for distributed deep learning training.
☆165 · May 7, 2020 · Updated 6 years ago
msr-fiddle / philly-traces
☆198 · Aug 31, 2019 · Updated 6 years ago
reconfigurable-ml-pipeline / ipa
Source code of IPA, https://escholarship.org/uc/item/2p0805dq
☆12 · Jun 27, 2024 · Updated 2 years ago
stanford-futuredata / POP
Code for "Solving Large-Scale Granular Resource Allocation Problems Efficiently with POP", which appeared at SOSP 2021
☆28 · Dec 15, 2021 · Updated 4 years ago
XiaofeiTJU / KaiS
☆47 · Jan 18, 2021 · Updated 5 years ago
petuum / adaptdl
Resource-adaptive cluster scheduler for deep learning training.
☆459 · Mar 5, 2023 · Updated 3 years ago
DIR-LAB / deep-batch-scheduler
RLScheduler: An Automated HPC Batch Job Scheduler Using Reinforcement Learning [SC'20]
☆70 · May 30, 2023 · Updated 3 years ago
S-Lab-System-Group / Lucid
Lucid: A Non-Intrusive, Scalable and Interpretable Scheduler for Deep Learning Training Jobs
☆61 · May 21, 2023 · Updated 3 years ago
yylin1 / papers-notebook-with-scheduling
Master's thesis literature notes (Deep Learning, Scheduling, Distributed, Kubernetes)
☆51 · May 5, 2019 · Updated 7 years ago
siasosp23 / artifacts
☆24 · Aug 15, 2023 · Updated 2 years ago
S-Lab-System-Group / Awesome-DL-Scheduling-Papers
☆333 · Jan 22, 2024 · Updated 2 years ago
S-Lab-System-Group / Awesome-ML-for-System
SOTA Learning-augmented Systems
☆37 · May 21, 2022 · Updated 4 years ago
S-Lab-System-Group / HeliosData
Helios Traces from SenseTime
☆63 · Sep 27, 2022 · Updated 3 years ago
lwangbm / Metis
Metis: Learning to Schedule Long-Running Applications in Shared Container Clusters at Scale
☆19 · May 27, 2020 · Updated 6 years ago
pkusys / ElasticFlow
Artifacts for our ASPLOS'23 paper ElasticFlow
☆56 · May 10, 2024 · Updated 2 years ago
aleasimulator / alea
Advanced job scheduling simulator
☆18 · Nov 6, 2023 · Updated 2 years ago
microsoft / hivedscheduler
Kubernetes Scheduler for Deep Learning
☆263 · May 22, 2022 · Updated 4 years ago
angao / scheduler-framework-sample
This repo is a sample for the Kubernetes scheduler framework.
☆47 · Oct 9, 2021 · Updated 4 years ago
umassos / GAIA
☆12 · Mar 27, 2024 · Updated 2 years ago
msr-fiddle / blox
☆47 · Jul 4, 2024 · Updated 2 years ago
msr-fiddle / synergy
☆54 · Dec 13, 2022 · Updated 3 years ago
microsoft / SelfTune
SelfTune is an RL framework that enables systems and service developers to automatically tune various configuration parameters and other …
☆46 · May 31, 2024 · Updated 2 years ago
SymbioticLab / Salus
Fine-grained GPU sharing primitives
☆149 · Jul 28, 2025 · Updated last year
SPEAR-UIC / CQSim
☆41 · Apr 13, 2025 · Updated last year
joapolarbear / dpro
Analysis for the traces from byteprofile
☆32 · Nov 21, 2023 · Updated 2 years ago
smallersoup / k8s-scheduler-extender-example
An example of a Kubernetes scheduler extender
☆15 · Apr 12, 2019 · Updated 7 years ago
Raphael-Hao / brainstorm
Compiler for Dynamic Neural Networks
☆45 · Nov 13, 2023 · Updated 2 years ago
tanjunchen / sample-scheduler-framework
A simple example of a custom scheduler built with the k8s Scheduler Framework
☆10 · Feb 27, 2020 · Updated 6 years ago
hku-systems / naspipe
☆14 · Jan 12, 2022 · Updated 4 years ago
ucbrise / hypersched
Deadline-based hyperparameter tuning on RayTune.
☆32 · Jan 16, 2020 · Updated 6 years ago
S-Lab-System-Group / Primo
Primo: Practical Learning-Augmented Systems with Interpretable Models
☆19 · Dec 26, 2023 · Updated 2 years ago
uw-mad-dash / shockwave
Artifact for "Shockwave: Fair and Efficient Cluster Scheduling for Dynamic Adaptation in Machine Learning" [NSDI '23]
☆46 · Nov 24, 2022 · Updated 3 years ago
heyfey / vodascheduler
GPU scheduler for elastic/distributed deep learning workloads in Kubernetes cluster (IC2E'23)
☆33 · Nov 11, 2023 · Updated 2 years ago