google-research / ml-for-systems-taxonomy
☆18 Updated 3 years ago
Alternatives and similar repositories for ml-for-systems-taxonomy:
Users who are interested in ml-for-systems-taxonomy compare it to the repositories listed below
- ☆47 Updated 2 years ago
- Cavs: An Efficient Runtime System for Dynamic Neural Networks ☆14 Updated 4 years ago
- An experimental parallel training platform ☆54 Updated 10 months ago
- ☆44 Updated last year
- An Attention Superoptimizer ☆21 Updated last month
- ☆14 Updated last year
- ParaDnn: A systematic performance analysis methodology for deep learning. ☆39 Updated 4 years ago
- A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup ☆34 Updated 2 years ago
- ☆73 Updated 3 years ago
- ☆16 Updated 2 years ago
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches ☆15 Updated 5 years ago
- Slides from 2021-12-15 talk, "TVM Developer Bootcamp – Writing Hardware Backends" ☆10 Updated 3 years ago
- 🔮 Execution time predictions for deep neural network training iterations across different GPUs. ☆59 Updated 2 years ago
- PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for… ☆128 Updated this week
- This is the implementation for the paper: AdaTune: Adaptive Tensor Program Compilation Made Efficient (NeurIPS 2020). ☆13 Updated 3 years ago
- Mille Crepe Bench: layer-wise performance analysis for deep learning frameworks. ☆17 Updated 5 years ago
- Dynamic resources changes for multi-dimensional parallelism training ☆22 Updated 3 months ago
- Stateful LLM Serving ☆46 Updated 6 months ago
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo ☆19 Updated last year
- An IR for efficiently simulating distributed ML computation. ☆27 Updated last year
- ☆39 Updated 4 years ago
- Set of datasets for the deep learning recommendation model (DLRM). ☆41 Updated 2 years ago
- ☆14 Updated 3 years ago
- RL-Scope: Cross-Stack Profiling for Deep Reinforcement Learning Workloads ☆42 Updated 3 years ago
- Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS ☆19 Updated last week
- http://vlsiarch.eecs.harvard.edu/research/recommendation/ ☆133 Updated 2 years ago
- DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces. ☆53 Updated 6 months ago
- FTPipe and related pipeline model parallelism research. ☆41 Updated last year
- Boost hardware utilization for ML training workloads via Inter-model Horizontal Fusion ☆32 Updated 9 months ago
- A resilient distributed training framework ☆88 Updated 10 months ago