jianweif / OptimalGradCheckpointing

☆41

Alternatives and similar repositories for OptimalGradCheckpointing

Users who are interested in OptimalGradCheckpointing are comparing it to the libraries listed below.

ucbrise / actnn
ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
☆199 Updated 2 years ago
papers-submission / structured_transposable_masks
Code for ICML 2021 submission
☆35 Updated 4 years ago
cli99 / flops-profiler
pytorch-profiler
☆51 Updated 2 years ago
yaozhewei / HAP
☆43 Updated last year
uwsampl / dtr-prototype
Dynamic Tensor Rematerialization prototype (modified PyTorch) and simulator. Paper: https://arxiv.org/abs/2006.09616
☆132 Updated 2 years ago
facebookresearch / AlphaNet
AlphaNet Improved Training of Supernet with Alpha-Divergence
☆100 Updated 4 years ago
HazyResearch / fly
☆220 Updated 2 years ago
csyhhu / MetaQuant
Codes for Accepted Paper : "MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization" in NeurIPS 2019
☆54 Updated 5 years ago
cjf00000 / StatQuant
code for the paper "A Statistical Framework for Low-bitwidth Training of Deep Neural Networks"
☆29 Updated 5 years ago
aojunzz / NM-sparsity
☆243 Updated 3 years ago
gatech-sysml / CompOFA
[ICLR 2021] CompOFA: Compound Once-For-All Networks For Faster Multi-Platform Deployment
☆24 Updated 2 years ago
jundaf2 / INT8-Flash-Attention-FMHA-Quantization
☆158 Updated 2 years ago
facebookresearch / DepthShrinker
[ICML 2022] "DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks", by Yonggan …
☆72 Updated 3 years ago
Distributed-AI / PipeTransformer
PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models. ICML 2021
☆56 Updated 4 years ago
facebookresearch / AttentiveNAS
code for "AttentiveNAS Improving Neural Architecture Search via Attentive Sampling"
☆105 Updated 4 years ago
lottery-ticket / rewinding-iclr20-public
☆69 Updated 5 years ago
htqin / BiBERT
This project is the official implementation of our accepted ICLR 2022 paper BiBERT: Accurate Fully Binarized BERT.
☆88 Updated 2 years ago
BradMcDanel / sdgp
☆10 Updated 3 years ago
pytorch / torchdistx
Torch Distributed Experimental
☆117 Updated last year
facebookresearch / fairring
Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large …
☆65 Updated 3 years ago
Lightning-AI / forked-pdb
Python pdb for multiple processes
☆61 Updated 5 months ago
XinDongol / DNNAC
All about acceleration and compression of Deep Neural Networks
☆33 Updated 6 years ago
kaiyuyue / torchshard
Slicing a PyTorch Tensor Into Parallel Shards
☆301 Updated 5 months ago
VITA-Group / UVC
[ICLR 2022] "Unified Vision Transformer Compression" by Shixing Yu*, Tianlong Chen*, Jiayi Shen, Huan Yuan, Jianchao Tan, Sen Yang, Ji Li…
☆53 Updated last year
stanford-futuredata / stk
☆112 Updated last year
parasj / checkmate
Training neural networks in TensorFlow 2.0 with 5x less memory
☆137 Updated 3 years ago
eedalong / Dpex
Distributed DataLoader For Pytorch Based On Ray
☆24 Updated 4 years ago
stevenygd / SWALP
Code for paper "SWALP: Stochastic Weight Averaging for Low-Precision Training".
☆62 Updated 6 years ago
facebookresearch / NASViT
code for NASViT
☆67 Updated 3 years ago
NUS-HPC-AI-Lab / LARS-ImageNet-PyTorch
Accuracy 77%. Large batch deep learning optimizer LARS for ImageNet with PyTorch and ResNet, using Horovod for distribution. Optional acc…
☆38 Updated 4 years ago