google-research / ml-for-systems-taxonomy
☆17 Updated 3 years ago
Related projects:
- ☆47 Updated last year
- An IR for efficiently simulating distributed ML computation. ☆24 Updated 8 months ago
- Deadline-based hyperparameter tuning on RayTune. ☆31 Updated 4 years ago
- ParaDnn: A systematic performance analysis methodology for deep learning. ☆39 Updated 4 years ago
- [CF ’20] Verified Instruction-Level Energy Consumption Measurement for NVIDIA GPUs ☆14 Updated 3 years ago
- Cavs: An Efficient Runtime System for Dynamic Neural Networks ☆13 Updated 4 years ago
- An experimental parallel training platform ☆46 Updated 5 months ago
- 🔮 Execution time predictions for deep neural network training iterations across different GPUs. ☆55 Updated last year
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches ☆13 Updated 5 years ago
- PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models. ICML 2021 ☆54 Updated 3 years ago
- Machine learning on serverless platform ☆7 Updated 2 years ago
- Mille Crepe Bench: layer-wise performance analysis for deep learning frameworks. ☆17 Updated 4 years ago
- Development repository for integrating FlexFlow (A distributed deep learning framework that supports flexible parallelization strategies)… ☆28 Updated 2 years ago
- ACT: An Architectural Carbon Modeling Tool for Designing Sustainable Computer Systems ☆33 Updated last year
- MLPerf™ logging library ☆30 Updated last week
- ☆12 Updated last year
- An Attention Superoptimizer ☆19 Updated 4 months ago
- PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for… ☆118 Updated 2 weeks ago
- A Framework for Reasoning about System Performance using Causal AI ☆41 Updated 2 years ago
- FTPipe and related pipeline model parallelism research. ☆41 Updated last year
- How much energy do LLMs consume? ☆40 Updated last week
- Distributed tracing data from Meta's microservices architecture. ☆16 Updated last year
- Torch Frontend for IREE ☆25 Updated 8 months ago
- (NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters. ☆33 Updated last year
- Model-less Inference Serving ☆78 Updated 10 months ago
- Synthesizer for optimal collective communication algorithms ☆94 Updated 5 months ago
- Artifacts for SOSP'19 paper Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions ☆21 Updated 2 years ago
- ☆23 Updated last year
- ☆22 Updated 3 years ago
- General policies for MLPerf™ including submission rules, coding standards, etc. ☆27 Updated this week