hangqiu / ML-EXray
☆28Updated 3 years ago
Alternatives and similar repositories for ML-EXray
Users that are interested in ML-EXray are comparing it to the libraries listed below
- PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models. ICML 2021☆56Updated 4 years ago
- Boost hardware utilization for ML training workloads via Inter-model Horizontal Fusion☆32Updated last year
- ML model training for edge devices☆166Updated last year
- sensAI: ConvNets Decomposition via Class Parallelism for Fast Inference on Live Data☆64Updated last year
- Practical low-rank gradient compression for distributed optimization: https://arxiv.org/abs/1905.13727☆147Updated 10 months ago
- Source code for the paper: "A Latency-Predictable Multi-Dimensional Optimization Framework for DNN-driven Autonomous Systems"☆22Updated 4 years ago
- [ICLR 2021] HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark☆111Updated 2 years ago
- Official PyTorch Implementation of HELP: Hardware-adaptive Efficient Latency Prediction for NAS via Meta-Learning (NeurIPS 2021 Spotlight)☆63Updated last year
- Slides from 2021-12-15 talk, "TVM Developer Bootcamp – Writing Hardware Backends"☆10Updated 3 years ago
- ☆100Updated last year
- A schedule language for large model training☆149Updated last week
- ☆20Updated 3 years ago
- GRACE - GRAdient ComprEssion for distributed deep learning☆139Updated last year
- Official implementation of NeurIPS 2020 "Sparse Weight Activation Training" paper.☆28Updated 4 years ago
- Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation☆27Updated 5 years ago
- This is a list of awesome edgeAI inference related papers.☆97Updated last year
- Set of datasets for the deep learning recommendation model (DLRM).☆47Updated 2 years ago
- ☆43Updated last year
- Model-less Inference Serving☆91Updated last year
- You Only Search Once: On Lightweight Differentiable Architecture Search for Resource-Constrained Embedded Platforms☆11Updated 2 years ago
- This is the implementation for the paper: AdaTune: Adaptive Tensor Program Compilation Made Efficient (NeurIPS 2020).☆14Updated 4 years ago
- SparseTIR: Sparse Tensor Compiler for Deep Learning☆138Updated 2 years ago
- Multi-Instance-GPU profiling tool☆59Updated 2 years ago
- (NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.☆40Updated 2 years ago
- ☆41Updated 4 years ago
- Experimental deep learning framework written in Rust☆15Updated 2 years ago
- A research library for pytorch-based neural network pruning, compression, and more.☆162Updated 2 years ago
- 🔮 Execution time predictions for deep neural network training iterations across different GPUs.☆63Updated 2 years ago
- ☆94Updated 3 years ago
- [MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration☆201Updated 3 years ago