Lightweight and Parallel Deep Learning Framework
☆264 · Nov 26, 2022 · Updated 3 years ago
Alternatives and similar repositories for nimble
Users who are interested in nimble compare it to the libraries listed below.
- Welcome to PeriFlow CLI ☁︎ ☆12 · Aug 3, 2023 · Updated 2 years ago
- The Tensor Algebra SuperOptimizer for Deep Learning ☆739 · Jan 26, 2023 · Updated 3 years ago
- FriendliAI Model Hub ☆90 · Jun 9, 2022 · Updated 3 years ago
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections ☆125 · Jun 23, 2022 · Updated 3 years ago
- [MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration ☆199 · Apr 27, 2022 · Updated 3 years ago
- Dotfile management with bare git ☆21 · Feb 19, 2026 · Updated last week
- ☆15 · Jun 8, 2021 · Updated 4 years ago
- A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description. ☆1,006 · Sep 19, 2024 · Updated last year
- MIST: High-performance IoT Stream Processing ☆18 · Mar 19, 2019 · Updated 6 years ago
- DietCode Code Release ☆65 · Jul 21, 2022 · Updated 3 years ago
- FMO (Friendli Model Optimizer) ☆13 · Jan 8, 2025 · Updated last year
- A Tool for Automatic Parallelization of Deep Learning Training in Distributed Multi-GPU Environments. ☆132 · Feb 21, 2022 · Updated 4 years ago
- Model-less Inference Serving ☆94 · Nov 4, 2023 · Updated 2 years ago
- PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications ☆127 · May 9, 2022 · Updated 3 years ago
- [⛔️ DEPRECATED] Friendli: the fastest serving engine for generative AI ☆49 · Jun 25, 2025 · Updated 8 months ago
- Fine-grained GPU sharing primitives ☆148 · Jul 28, 2025 · Updated 7 months ago
- ☆192 · Mar 28, 2023 · Updated 2 years ago
- Automatic Schedule Exploration and Optimization Framework for Tensor Computations ☆182 · Apr 25, 2022 · Updated 3 years ago
- ☆48 · Sep 7, 2024 · Updated last year
- Resource-adaptive cluster scheduler for deep learning training. ☆454 · Mar 5, 2023 · Updated 2 years ago
- Artifacts for SOSP'19 paper Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions ☆21 · Apr 15, 2022 · Updated 3 years ago
- ☆21 · Jan 7, 2018 · Updated 8 years ago
- Ethereum VM fuzzer ☆62 · Jul 14, 2021 · Updated 4 years ago
- Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training ☆1,863 · Updated this week
- SparseTIR: Sparse Tensor Compiler for Deep Learning ☆143 · Mar 31, 2023 · Updated 2 years ago
- PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models. ICML 2021 ☆56 · Jul 21, 2021 · Updated 4 years ago
- Slides from 2021-12-15 talk, "TVM Developer Bootcamp – Writing Hardware Backends" ☆11 · Jan 20, 2022 · Updated 4 years ago
- Training neural networks in TensorFlow 2.0 with 5x less memory ☆137 · Feb 21, 2022 · Updated 4 years ago
- gossip: Efficient Communication Primitives for Multi-GPU Systems ☆62 · Jul 1, 2022 · Updated 3 years ago
- ☆392 · Nov 4, 2022 · Updated 3 years ago
- GPU-scheduler-for-deep-learning ☆210 · Nov 5, 2020 · Updated 5 years ago
- MONeT framework for reducing memory consumption of DNN training ☆174 · May 4, 2021 · Updated 4 years ago
- Cruise: A Distributed Machine Learning Framework with Automatic System Configuration ☆26 · Mar 19, 2019 · Updated 6 years ago
- ☆42 · Sep 8, 2023 · Updated 2 years ago
- Fast and Adaptive Distributed Machine Learning for TensorFlow, PyTorch and MindSpore. ☆295 · Feb 23, 2024 · Updated 2 years ago
- Benchmark scripts for TVM ☆74 · Mar 15, 2022 · Updated 3 years ago
- ☆26 · Dec 5, 2022 · Updated 3 years ago
- A GPipe implementation in PyTorch ☆863 · Jul 25, 2024 · Updated last year
- General system research material (not limited to paper) reading notes. ☆22 · Mar 17, 2021 · Updated 4 years ago