Lightweight and Parallel Deep Learning Framework
☆264 · Nov 26, 2022 · Updated 3 years ago
Alternatives and similar repositories for nimble
Users who are interested in nimble are comparing it to the libraries listed below.
- Welcome to PeriFlow CLI ☁︎ · ☆12 · Aug 3, 2023 · Updated 2 years ago
- Dotfile management with bare git · ☆21 · Mar 14, 2026 · Updated last week
- FriendliAI Model Hub · ☆90 · Jun 9, 2022 · Updated 3 years ago
- MIST: High-performance IoT Stream Processing · ☆18 · Mar 19, 2019 · Updated 7 years ago
- ☆15 · Jun 8, 2021 · Updated 4 years ago
- FMO (Friendli Model Optimizer) · ☆13 · Jan 8, 2025 · Updated last year
- Nemo: A flexible data processing system · ☆21 · Mar 12, 2018 · Updated 8 years ago
- [⛔️ DEPRECATED] Friendli: the fastest serving engine for generative AI · ☆50 · Jun 25, 2025 · Updated 8 months ago
- A Tool for Automatic Parallelization of Deep Learning Training in Distributed Multi-GPU Environments. · ☆131 · Feb 21, 2022 · Updated 4 years ago
- Ethereum VM fuzzer · ☆62 · Jul 14, 2021 · Updated 4 years ago
- The Tensor Algebra SuperOptimizer for Deep Learning · ☆740 · Jan 26, 2023 · Updated 3 years ago
- [MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration · ☆199 · Apr 27, 2022 · Updated 3 years ago
- ☆48 · Sep 7, 2024 · Updated last year
- Cruise: A Distributed Machine Learning Framework with Automatic System Configuration · ☆26 · Mar 19, 2019 · Updated 7 years ago
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections · ☆125 · Jun 23, 2022 · Updated 3 years ago
- DietCode Code Release · ☆65 · Jul 21, 2022 · Updated 3 years ago
- A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description. · ☆1,003 · Sep 19, 2024 · Updated last year
- ☆21 · Jan 7, 2018 · Updated 8 years ago
- Apache Nemo (Incubating) - Data Processing System for Flexible Employment With Different Deployment Characteristics · ☆113 · Jul 1, 2025 · Updated 8 months ago
- PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications · ☆127 · May 9, 2022 · Updated 3 years ago
- Artifacts for SOSP'19 paper Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions · ☆21 · Apr 15, 2022 · Updated 3 years ago
- Fine-grained GPU sharing primitives · ☆147 · Jul 28, 2025 · Updated 7 months ago
- ☆15 · Oct 4, 2022 · Updated 3 years ago
- ☆192 · Mar 28, 2023 · Updated 2 years ago
- Model-less Inference Serving · ☆94 · Nov 4, 2023 · Updated 2 years ago
- An external memory allocator example for PyTorch. · ☆16 · Aug 10, 2025 · Updated 7 months ago
- Automatic Schedule Exploration and Optimization Framework for Tensor Computations · ☆183 · Apr 25, 2022 · Updated 3 years ago
- Resource-adaptive cluster scheduler for deep learning training. · ☆453 · Mar 5, 2023 · Updated 3 years ago
- ☆22 · Sep 7, 2019 · Updated 6 years ago
- ☆392 · Nov 4, 2022 · Updated 3 years ago
- GPU-scheduler-for-deep-learning · ☆209 · Nov 5, 2020 · Updated 5 years ago
- MONeT framework for reducing memory consumption of DNN training · ☆174 · May 4, 2021 · Updated 4 years ago
- Training neural networks in TensorFlow 2.0 with 5x less memory · ☆137 · Feb 21, 2022 · Updated 4 years ago
- Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training · ☆1,864 · Updated this week
- Fast and Adaptive Distributed Machine Learning for TensorFlow, PyTorch and MindSpore. · ☆295 · Feb 23, 2024 · Updated 2 years ago
- Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators · ☆121 · Oct 26, 2022 · Updated 3 years ago
- SparseTIR: Sparse Tensor Compiler for Deep Learning · ☆144 · Mar 31, 2023 · Updated 2 years ago
- gossip: Efficient Communication Primitives for Multi-GPU Systems · ☆62 · Jul 1, 2022 · Updated 3 years ago
- Slides from 2021-12-15 talk, "TVM Developer Bootcamp – Writing Hardware Backends" · ☆11 · Jan 20, 2022 · Updated 4 years ago