geoffxy / habitat
🔮 Execution time predictions for deep neural network training iterations across different GPUs.
☆54 · Updated last year
Related projects
Alternatives and complementary repositories for habitat
- Model-less Inference Serving ☆82 · Updated last year
- DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces. ☆54 · Updated 2 months ago
- Synthesizer for optimal collective communication algorithms ☆98 · Updated 7 months ago
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections ☆114 · Updated 2 years ago
- An Efficient Pipelined Data Parallel Approach for Training Large Models ☆70 · Updated 3 years ago
- ☆72 · Updated last year
- Code for "Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads", which appeared at OSDI 2020 ☆125 · Updated 3 months ago
- PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications ☆124 · Updated 2 years ago
- An interference-aware scheduler for fine-grained GPU sharing ☆108 · Updated 5 months ago
- ☆35 · Updated 3 years ago
- ☆37 · Updated 3 years ago
- AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23) ☆79 · Updated last year
- Paella: Low-latency Model Serving with Virtualized GPU Scheduling ☆57 · Updated 6 months ago
- GVProf: A Value Profiler for GPU-based Clusters ☆47 · Updated 7 months ago
- Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines. ☆44 · Updated 11 months ago
- Fine-grained GPU sharing primitives ☆140 · Updated 4 years ago
- SOTA Learning-augmented Systems ☆32 · Updated 2 years ago
- NCCL Profiling Kit ☆109 · Updated 4 months ago
- An experimental parallel training platform ☆52 · Updated 7 months ago
- A Deep Learning Cluster Scheduler ☆37 · Updated 3 years ago
- Artifacts for our ASPLOS'23 paper ElasticFlow ☆52 · Updated 6 months ago
- REEF is a GPU-accelerated DNN inference serving system that enables instant kernel preemption and biased concurrent execution in GPU sche… ☆85 · Updated last year
- ☆16 · Updated last year
- ☆44 · Updated last year
- FTPipe and related pipeline model parallelism research. ☆41 · Updated last year
- A resilient distributed training framework ☆85 · Updated 6 months ago
- ☆65 · Updated 3 years ago
- ☆38 · Updated 4 years ago
- ☆47 · Updated last year
- A GPU-accelerated DNN inference serving system that supports instant kernel preemption and biased concurrent execution in GPU scheduling. ☆39 · Updated 2 years ago