kzhang28/Optimus

An Efficient Dynamic Resource Scheduler for Deep Learning Clusters

☆41

Alternatives and similar repositories for Optimus

Users that are interested in Optimus are comparing it to the libraries listed below

stanford-futuredata / gavel
Code for "Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads", which appeared at OSDI 2020
☆137 · Jul 25, 2024 · Updated last year
pengyanghua / optimus
A Deep Learning Cluster Scheduler
☆37 · Jan 11, 2021 · Updated 5 years ago
msr-fiddle / philly-traces
☆198 · Aug 31, 2019 · Updated 6 years ago
SymbioticLab / Tiresias
Tiresias is a GPU cluster manager for distributed deep learning training.
☆166 · May 7, 2020 · Updated 5 years ago
intel / nodus
Simulated large clusters for Kubernetes scheduler validation.
☆15 · Jan 3, 2023 · Updated 3 years ago
S-Lab-System-Group / HeliosData
Helios Traces from SenseTime
☆61 · Sep 27, 2022 · Updated 3 years ago
yylin1 / papers-notebook-with-scheduling
Master's thesis literature notes (Deep Learning, Scheduling, Distributed, Kubernetes)
☆51 · May 5, 2019 · Updated 6 years ago
pengyanghua / DL2
A deep learning-driven scheduler for elastic training in deep learning clusters
☆31 · Jan 14, 2021 · Updated 5 years ago
ucbrise / caravel
Studying GPU Multi-tenancy
☆11 · Jan 11, 2019 · Updated 7 years ago
HiEST / gpu-topo-aware
GPU topology-aware scheduler
☆13 · Jul 7, 2017 · Updated 8 years ago
msr-fiddle / synergy
☆52 · Dec 13, 2022 · Updated 3 years ago
microsoft / hivedscheduler
Kubernetes Scheduler for Deep Learning
☆264 · May 22, 2022 · Updated 3 years ago
stanford-futuredata / POP
Code for "Solving Large-Scale Granular Resource Allocation Problems Efficiently with POP", which appeared at SOSP 2021
☆28 · Dec 15, 2021 · Updated 4 years ago
SymbioticLab / Salus
Fine-grained GPU sharing primitives
☆148 · Jul 28, 2025 · Updated 7 months ago
anandj91 / p3
☆21 · Nov 29, 2022 · Updated 3 years ago
S-Lab-System-Group / ChronusArtifact
☆23 · Jan 7, 2022 · Updated 4 years ago
petuum / adaptdl
Resource-adaptive cluster scheduler for deep learning training.
☆454 · Mar 5, 2023 · Updated 3 years ago
cake-lab / perseus
☆10 · Jul 5, 2023 · Updated 2 years ago
SeldonIO / trtis-k8s-scheduler
Custom Scheduler to deploy ML models to TRTIS for GPU Sharing
☆11 · Apr 1, 2020 · Updated 5 years ago
jqlu / ackctl
☆10 · Jul 29, 2020 · Updated 5 years ago
hiddenlayer2020 / ML-Job-Scheduler-MLFS
☆11 · Dec 18, 2020 · Updated 5 years ago
thecooltechguy / mlbot
A fast & easy way to train ML models in your cloud, directly from your laptop.
☆14 · Mar 28, 2022 · Updated 3 years ago
oslab-ewha / memwork
Workload estimation tool from memory traces
☆16 · Dec 6, 2019 · Updated 6 years ago
SJTU-IPADS / wukong-cube
A distributed in-memory store for temporal knowledge graphs
☆10 · Mar 20, 2024 · Updated last year
dyweb / gommon
A collection of common util libraries for Go
☆25 · Oct 25, 2020 · Updated 5 years ago
stanford-mast / INFaaS
Model-less Inference Serving
☆94 · Nov 4, 2023 · Updated 2 years ago
microsoft / elasticflow-traces
Integrated Training Platform (ITP) traces used in ElasticFlow paper.
☆31 · Dec 23, 2022 · Updated 3 years ago
tencentyun / chdfs-hadoop-plugin
the hadoop plugin for chdfs
☆14 · Updated this week
epfl-labos / eagle
☆13 · Jan 16, 2019 · Updated 7 years ago
beloglazov / planetlab-workload-traces
A set of CPU utilization traces from PlanetLab VMs collected during 10 random days in March and April 2011
☆30 · Nov 19, 2012 · Updated 13 years ago
alibaba / GPU-scheduler-for-deep-learning
GPU-scheduler-for-deep-learning
☆210 · Nov 5, 2020 · Updated 5 years ago
pkusys / ElasticFlow
Artifacts for our ASPLOS'23 paper ElasticFlow
☆55 · May 10, 2024 · Updated last year
hkust-adsl / kubernetes-scheduler-simulator
Kubernetes Scheduler Simulator
☆125 · Jul 31, 2024 · Updated last year
volcano-retired / scheduler
The scheduler of Volcano, built based on kubernetes-sigs/kube-batch
☆14 · Jul 7, 2019 · Updated 6 years ago
rh01 / deeprm
Deep reinforcement learning for resource management and job scheduling. It is inspired by the deeprm model and I will implement for in practica…
☆12 · Jun 14, 2019 · Updated 6 years ago
jiaxincao / model-factory
Model factory is an ML training platform that helps engineers build ML models at scale
☆17 · Sep 27, 2021 · Updated 4 years ago
tbd-ai / tbd-tools
☆12 · May 3, 2020 · Updated 5 years ago
gudiandian / ElasticFlow
☆17 · May 10, 2024 · Updated last year
netx-repo / PipeSwitch
PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications
☆127 · May 9, 2022 · Updated 3 years ago