Training neural networks in TensorFlow 2.0 with 5x less memory
☆137 · Feb 21, 2022 · Updated 4 years ago
Alternatives and similar repositories for checkmate
Users that are interested in checkmate are comparing it to the libraries listed below
- Thinking is hard - automate it ☆18 · Aug 24, 2022 · Updated 3 years ago
- ML model training for edge devices ☆168 · Sep 29, 2023 · Updated 2 years ago
- Dynamic Tensor Rematerialization prototype (modified PyTorch) and simulator. Paper: https://arxiv.org/abs/2006.09616 ☆133 · Jul 6, 2023 · Updated 2 years ago
- Experimental deep learning framework written in Rust ☆15 · Nov 2, 2022 · Updated 3 years ago
- ☆13 · Feb 22, 2023 · Updated 3 years ago
- ☆22 · Nov 7, 2018 · Updated 7 years ago
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections ☆125 · Jun 23, 2022 · Updated 3 years ago
- ☆15 · Jun 8, 2021 · Updated 4 years ago
- MobiSys#114 ☆23 · Aug 17, 2023 · Updated 2 years ago
- ☆41 · Jun 18, 2021 · Updated 4 years ago
- ☆15 · Apr 20, 2022 · Updated 3 years ago
- ☆17 · Dec 9, 2022 · Updated 3 years ago
- An Efficient Pipelined Data Parallel Approach for Training Large Model ☆75 · Dec 11, 2020 · Updated 5 years ago
- MONeT framework for reducing memory consumption of DNN training ☆174 · May 4, 2021 · Updated 4 years ago
- ☆392 · Nov 4, 2022 · Updated 3 years ago
- Artifact repository for paper Automatic Generation of High-Performance Quantized Machine Learning Kernels ☆17 · Oct 13, 2020 · Updated 5 years ago
- Fine-grained GPU sharing primitives ☆147 · Jul 28, 2025 · Updated 7 months ago
- FTPipe and related pipeline model parallelism research. ☆44 · May 16, 2023 · Updated 2 years ago
- ☆14 · Mar 10, 2024 · Updated 2 years ago
- PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models. ICML 2021 ☆56 · Jul 21, 2021 · Updated 4 years ago
- ☆42 · Sep 8, 2023 · Updated 2 years ago
- An optimizing compiler for decision tree ensemble inference. ☆18 · Jul 11, 2025 · Updated 8 months ago
- The Tensor Algebra SuperOptimizer for Deep Learning ☆740 · Jan 26, 2023 · Updated 3 years ago
- ☆10 · Aug 4, 2020 · Updated 5 years ago
- Summaries of readings in operating systems, networking and machine learning ☆22 · Feb 4, 2019 · Updated 7 years ago
- ☆13 · Nov 1, 2021 · Updated 4 years ago
- PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications ☆127 · May 9, 2022 · Updated 3 years ago
- Artifact for 'Register Optimizations for Stencils on GPUs' ☆10 · Sep 18, 2018 · Updated 7 years ago
- Lightweight and Parallel Deep Learning Framework ☆264 · Nov 26, 2022 · Updated 3 years ago
- Training and serving large-scale neural networks with auto parallelization. ☆3,187 · Dec 9, 2023 · Updated 2 years ago
- ☆78 · May 4, 2021 · Updated 4 years ago
- Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training ☆1,864 · Updated this week
- Pipeline Parallelism for PyTorch ☆785 · Aug 21, 2024 · Updated last year
- ☆199 · Aug 31, 2019 · Updated 6 years ago
- Fast and Adaptive Distributed Machine Learning for TensorFlow, PyTorch and MindSpore. ☆295 · Feb 23, 2024 · Updated 2 years ago
- Shared library for intercepting CUDA Runtime API calls. This was part of my Bachelor thesis: A Study on the Computational Exploitation of… ☆14 · Jun 6, 2024 · Updated last year
- A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description. ☆1,003 · Sep 19, 2024 · Updated last year
- NAACL '24 (Best Demo Paper RunnerUp) / MlSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference ☆69 · Dec 9, 2024 · Updated last year
- A Generic Resource-Aware Hyperparameter Tuning Execution Engine ☆15 · Jan 8, 2022 · Updated 4 years ago