Training neural networks in TensorFlow 2.0 with 5x less memory
☆137Feb 21, 2022Updated 4 years ago
Alternatives and similar repositories for checkmate
Users that are interested in checkmate are comparing it to the libraries listed below
Sorting:
- ☆13Feb 22, 2023Updated 3 years ago
- Thinking is hard - automate it☆18Aug 24, 2022Updated 3 years ago
- ☆15Jun 8, 2021Updated 4 years ago
- Dynamic Tensor Rematerialization prototype (modified PyTorch) and simulator. Paper: https://arxiv.org/abs/2006.09616☆133Jul 6, 2023Updated 2 years ago
- ☆22Nov 7, 2018Updated 7 years ago
- ☆15Apr 20, 2022Updated 3 years ago
- ML model training for edge devices☆168Sep 29, 2023Updated 2 years ago
- An optimizing compiler for decision tree ensemble inference.☆18Jul 11, 2025Updated 7 months ago
- Fine-grained GPU sharing primitives☆148Jul 28, 2025Updated 7 months ago
- MONeT framework for reducing memory consumption of DNN training☆174May 4, 2021Updated 4 years ago
- FTPipe and related pipeline model parallelism research.☆44May 16, 2023Updated 2 years ago
- A Generic Resource-Aware Hyperparameter Tuning Execution Engine☆15Jan 8, 2022Updated 4 years ago
- ☆13Nov 1, 2021Updated 4 years ago
- ☆392Nov 4, 2022Updated 3 years ago
- MobiSys#114☆23Aug 17, 2023Updated 2 years ago
- An Efficient Pipelined Data Parallel Approach for Training Large Model☆75Dec 11, 2020Updated 5 years ago
- ☆14Mar 10, 2024Updated last year
- The Tensor Algebra SuperOptimizer for Deep Learning☆739Jan 26, 2023Updated 3 years ago
- PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications☆127May 9, 2022Updated 3 years ago
- ☆42Sep 8, 2023Updated 2 years ago
- Artifact repository for paper Automatic Generation of High-Performance Quantized Machine Learning Kernels☆17Oct 13, 2020Updated 5 years ago
- Lightweight and Parallel Deep Learning Framework☆264Nov 26, 2022Updated 3 years ago
- Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton☆40Feb 13, 2025Updated last year
- ☆17Dec 9, 2022Updated 3 years ago
- Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training☆1,863Updated this week
- ☆78May 4, 2021Updated 4 years ago
- Training and serving large-scale neural networks with auto parallelization.☆3,183Dec 9, 2023Updated 2 years ago
- A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.☆1,006Sep 19, 2024Updated last year
- Pipeline Parallelism for PyTorch☆785Aug 21, 2024Updated last year
- Artifacts for SOSP'19 paper Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions☆21Apr 15, 2022Updated 3 years ago
- [MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration☆199Apr 27, 2022Updated 3 years ago
- AI and Memory Wall☆226Mar 23, 2024Updated last year
- NAACL '24 (Best Demo Paper RunnerUp) / MlSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference☆69Dec 9, 2024Updated last year
- PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models. ICML 2021☆56Jul 21, 2021Updated 4 years ago
- ddl-benchmarks: Benchmarks for Distributed Deep Learning☆36May 29, 2020Updated 5 years ago
- ☆22Dec 15, 2023Updated 2 years ago
- SOTA Learning-augmented Systems☆37May 21, 2022Updated 3 years ago
- Statistical discontinuous constituent parsing☆11Feb 15, 2018Updated 8 years ago
- Artifact for 'Register Optimizations for Stencils on GPUs'☆10Sep 18, 2018Updated 7 years ago