snuspl / nimble

Lightweight and Parallel Deep Learning Framework

☆264

Alternatives and similar repositories for nimble

Users who are interested in nimble are comparing it to the libraries listed below.

NVIDIA / PyProf
A GPU performance profiling tool for PyTorch models
☆508 Updated 4 years ago
ConstantPark / DL_Compiler
Study Group of Deep Learning Compiler
☆165 Updated 2 years ago
kakaobrain / torchgpipe
A GPipe implementation in PyTorch
☆857 Updated last year
mit-han-lab / inter-operator-scheduler
[MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration
☆200 Updated 3 years ago
pytorch / tensorpipe
A tensor-aware point-to-point communication primitive for machine learning
☆274 Updated 2 months ago
parasj / checkmate
Training neural networks in TensorFlow 2.0 with 5x less memory
☆136 Updated 3 years ago
msr-fiddle / pipedream
☆393 Updated 2 years ago
spcl / substation
Research and development for optimizing transformers
☆131 Updated 4 years ago
awslabs / raf
☆145 Updated 8 months ago
cmu-catalyst / collage
System for automated integration of deep learning backends.
☆47 Updated 3 years ago
mlsys-seo / ooo-backprop
☆25 Updated 2 years ago
NVIDIA / nvtx-plugins
Python bindings for NVTX
☆66 Updated 2 years ago
AlibabaPAI / DAPPLE
An Efficient Pipelined Data Parallel Approach for Training Large Model
☆76 Updated 4 years ago
mlcommons / training_results_v0.7
This repository contains the results and code for the MLPerf™ Training v0.7 benchmark.
☆57 Updated 2 years ago
uwsampl / dtr-prototype
Dynamic Tensor Rematerialization prototype (modified PyTorch) and simulator. Paper: https://arxiv.org/abs/2006.09616
☆132 Updated 2 years ago
facebookresearch / HolisticTraceAnalysis
A library to analyze PyTorch traces.
☆419 Updated last week
ezyang / nvprof2json
Convert nvprof profiles into about:tracing compatible JSON files
☆70 Updated 4 years ago
facebookresearch / fairring
Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large …
☆65 Updated 3 years ago
snuspl / parallax
A Tool for Automatic Parallelization of Deep Learning Training in Distributed Multi-GPU Environments.
☆132 Updated 3 years ago
thu-pacman / PET
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections
☆122 Updated 3 years ago
jiazhihao / metaflow_sysml19
Repository for SysML19 Artifacts Evaluation
☆54 Updated 6 years ago
microsoft / varuna
☆252 Updated last year
pytorch / rfcs
PyTorch RFCs (experimental)
☆135 Updated 5 months ago
andersy005 / tvm-in-action
TVM stack: exploring the incredible explosion of deep-learning frameworks and how to bring them together
☆64 Updated 7 years ago
tlc-pack / TLCBench
Benchmark scripts for TVM
☆74 Updated 3 years ago
google-research / sputnik
A library of GPU kernels for sparse matrix operations.
☆275 Updated 4 years ago
jiazhihao / TASO
The Tensor Algebra SuperOptimizer for Deep Learning
☆730 Updated 2 years ago
awslabs / lorien
☆42 Updated 2 years ago
graphcore / poptorch
PyTorch interface for the IPU
☆181 Updated 2 years ago
mlcommons / training_results_v1.0
This repository contains the results and code for the MLPerf™ Training v1.0 benchmark.
☆37 Updated last year