icloud-ecnu/spotDNN

spotDNN is a heterogeneity-aware spot instance provisioning framework to provide predictable performance for DDNN training workloads in the cloud.

☆15

Alternatives and similar repositories for spotDNN

Users who are interested in spotDNN are comparing it to the repositories listed below.

icloud-ecnu / Prophet
Prophet is a predictable communication scheduling strategy to schedule the gradient transfer in an adequate order, with the aim of maximi…
☆16 · Sep 13, 2023 · Updated 2 years ago
icloud-ecnu / delaystage
DelayStage is a simple yet effective stage delay scheduling strategy to interleave the cluster resources across the parallel stages, so a…
☆14 · Sep 7, 2023 · Updated 2 years ago
icloud-ecnu / CCC2023
☆12 · Sep 20, 2023 · Updated 2 years ago
icloud-ecnu / ebrowser
ebrowser, an energy-efficient and lightweight human interaction framework without degrading the user experience in mobile Web browsers.
☆12 · Sep 7, 2023 · Updated 2 years ago
icloud-ecnu / ispot
iSpot is a lightweight and cost-effective instance provisioning framework for Directed Acyclic Graph (DAG)-style big data analytics, in …
☆11 · Sep 7, 2023 · Updated 2 years ago
icloud-ecnu / Tetris
Tetris, a model predictive control (MPC)-based container scheduling strategy to judiciously make migration decisions for long-running con…
☆25 · Dec 30, 2024 · Updated last year
icloud-ecnu / paper-reading-list
Paper reading list for the iCloud group
☆14 · Nov 22, 2025 · Updated 3 months ago
icloud-ecnu / lambdadnn
λDNN is a cost-efficient function resource provisioning framework to minimize the monetary cost and guarantee the performance for DDNN tr…
☆23 · Oct 25, 2023 · Updated 2 years ago
icloud-ecnu / Opara
Opara is a lightweight and resource-aware DNN Operator parallel scheduling framework to accelerate the execution of DNN inference on GPUs…
☆23 · Dec 19, 2024 · Updated last year
JLINEkai / CLOVER
Cost-efficient and Instruction-driven AI Conversation in Digital Pathology
☆24 · Nov 5, 2025 · Updated 4 months ago
phylyd / QBNN
Quantum Binary Neural Networks
☆15 · Oct 20, 2019 · Updated 6 years ago
RUCBM / LeaF
☆12 · Nov 2, 2025 · Updated 4 months ago
pnnl / nwqbench
☆12 · Jul 18, 2024 · Updated last year
hiddenlayer2020 / ML-Job-Scheduler-MLFS
☆11 · Dec 18, 2020 · Updated 5 years ago
Quantinuum / pytket-dqc
☆16 · Updated this week
umassos / GAIA
☆12 · Mar 27, 2024 · Updated last year
michaelzhiluo / starburst
Burstable Cloud Scheduler
☆16 · Jun 6, 2024 · Updated last year
JQub / QuantumFlow
☆18 · Jun 3, 2021 · Updated 4 years ago
eth-easl / sailor
AI model training on heterogeneous, geo-distributed resources
☆38 · Nov 24, 2025 · Updated 3 months ago
lovelyqian / ME-D2N_for_CDFSL
Repository for the paper: ME-D2N: Multi-Expert Domain Decompositional Network for Cross-Domain Few-Shot Learning
☆22 · Mar 10, 2024 · Updated last year
indigoLovee / DQN
DQN Pytorch
☆16 · Dec 13, 2021 · Updated 4 years ago
S-Lab-System-Group / ChronusArtifact
☆23 · Jan 7, 2022 · Updated 4 years ago
microsoft / elasticflow-traces
Integrated Training Platform (ITP) traces used in ElasticFlow paper.
☆31 · Dec 23, 2022 · Updated 3 years ago
pengyanghua / DL2
A deep learning-driven scheduler for elastic training in deep learning clusters
☆31 · Jan 14, 2021 · Updated 5 years ago
ustc-hyin / ClearSight
Code for paper: Visual Signal Enhancement for Object Hallucination Mitigation in Multimodal Large language Models
☆52 · Dec 18, 2024 · Updated last year
hipersys-team / TopoOpt
[NSDI 2023] TopoOpt: Optimizing the Network Topology for Distributed DNN Training
☆39 · Sep 10, 2024 · Updated last year
Relaxed-System-Lab / HexGen
[ICML 2024] Serving LLMs on heterogeneous decentralized clusters.
☆34 · May 6, 2024 · Updated last year
WonJoon-Yun / Quantum-Multi-Agent-Reinforcement-Learning
Quantum Multi-agent Reinforcement Learning (QMARL)
☆43 · May 8, 2022 · Updated 3 years ago
l1nkr / DL-Compiler-Navigation
Machine Learning Compiler Road Map
☆46 · Sep 12, 2023 · Updated 2 years ago
kzhang28 / Optimus
An Efficient Dynamic Resource Scheduler for Deep Learning Clusters
☆41 · Oct 28, 2017 · Updated 8 years ago
pengyanghua / optimus
A Deep Learning Cluster Scheduler
☆37 · Jan 11, 2021 · Updated 5 years ago
liupei101 / VLSA
Interpretable Vision-Language Survival Analysis with Ordinal Inductive Bias for Computational Pathology (ICLR 2025)
☆66 · May 5, 2025 · Updated 10 months ago
Rivendile / Muri
Artifacts for our SIGCOMM'22 paper Muri
☆43 · Dec 29, 2023 · Updated 2 years ago
ankan-ban / llama_cu_awq
llama INT4 cuda inference with AWQ
☆54 · Jan 20, 2025 · Updated last year
maxwell0027 / PEFAT
[CVPR 2023] PEFAT: Boosting Semi-supervised Medical Image Classification via Pseudo-loss Estimation and Feature Adversarial Training
☆52 · Jun 25, 2023 · Updated 2 years ago
uclasystem / bamboo
Bamboo is a system for running large pipeline-parallel DNNs affordably, reliably, and efficiently using spot instances.
☆55 · Dec 11, 2022 · Updated 3 years ago
alibaba / alibaba-lingjun-dataset-2023
☆64 · Jun 25, 2024 · Updated last year
S-Lab-System-Group / HeliosData
Helios Traces from SenseTime
☆61 · Sep 27, 2022 · Updated 3 years ago
aldraus / quilt-llava
Codebase for Quilt-LLaVA
☆83 · Jun 28, 2024 · Updated last year