PhilJd / contiguous_pytorch_params

Accelerate training by storing parameters in one contiguous chunk of memory.

☆290

Alternatives and similar repositories for contiguous_pytorch_params

Users interested in contiguous_pytorch_params are comparing it to the libraries listed below.

kaiyuyue / torchshard
Slicing a PyTorch Tensor Into Parallel Shards
☆299 Updated last month
yaysummeriscoming / DALI_pytorch_demo
Example code showing how to use Nvidia DALI in pytorch, with fallback to torchvision. Contains a few differences to the official Nvidia …
☆197 Updated 5 years ago
majumderb / rezero
Official PyTorch Repo for "ReZero is All You Need: Fast Convergence at Large Depth"
☆410 Updated last year
NVIDIA / runx
Deep Learning Experiment Management
☆640 Updated 2 years ago
cybertronai / pytorch-lamb
Implementation of https://arxiv.org/abs/1904.00962
☆376 Updated 4 years ago
justheuristic / prefetch_generator
Simple package that makes your generator work in background thread
☆280 Updated 3 years ago
awwong1 / torchprof
PyTorch layer-by-layer model profiler
☆606 Updated 4 years ago
pabloppp / pytorch-tools
Useful PyTorch functions and modules that are not implemented in PyTorch by default
☆188 Updated last year
prigoyal / pytorch_memonger
Experimental ground for optimizing memory of pytorch models
☆366 Updated 7 years ago
PistonY / torch-toolbox
🛠 Toolbox to extend PyTorch functionalities
☆421 Updated last year
alphadl / lookahead.pytorch
lookahead optimizer (Lookahead Optimizer: k steps forward, 1 step back) for pytorch
☆337 Updated 5 years ago
yangkky / distributed_tutorial
☆261 Updated 5 years ago
signatrix / regnet
Pytorch implementation of network design paradigm described in the paper "Designing Network Design Spaces"
☆185 Updated last year
zhijian-liu / torchprofile
A general and accurate MACs / FLOPs profiler for PyTorch models
☆624 Updated last week
mit-han-lab / lite-transformer
[ICLR 2020] Lite Transformer with Long-Short Range Attention
☆612 Updated last year
ppwwyyxx / RAM-multiprocess-dataloader
Demystify RAM Usage in Multi-Process Data Loaders
☆194 Updated 2 years ago
Alibaba-MIIL / TResNet
Official Pytorch Implementation of "TResNet: High-Performance GPU-Dedicated Architecture" (WACV 2021)
☆475 Updated 7 months ago
Santosh-Gupta / SpeedTorch
Library for faster pinned CPU <-> GPU transfer in Pytorch
☆685 Updated 5 years ago
Yonghongwei / Gradient-Centralization
A New Optimization Technique for Deep Neural Networks
☆537 Updated 3 years ago
lonePatient / lookahead_pytorch
PyTorch implementation of the Lookahead optimizer
☆191 Updated 3 years ago
ucbrise / actnn
ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
☆200 Updated 2 years ago
sacmehta / delight
DeLighT: Very Deep and Light-Weight Transformers
☆470 Updated 4 years ago
csrhddlam / pytorch-checkpoint
☆165 Updated 6 years ago
Lyken17 / pytorch-memonger
Sublinear memory optimization for deep learning. https://arxiv.org/abs/1604.06174
☆599 Updated 5 years ago
narumiruna / pytorch-distributed-example
☆169 Updated 4 years ago
szq0214 / MEAL-V2
MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks. In NeurIPS 2020 workshop.
☆697 Updated 3 years ago
1adrianb / pytorch-estimate-flops
Estimate/count FLOPS for a given neural network using pytorch
☆305 Updated 3 years ago
facebookresearch / bitsandbytes
Library for 8-bit optimizers and quantization routines.
☆769 Updated 2 years ago
mgrankin / over9000
Over9000 optimizer
☆426 Updated 2 years ago
NVIDIA / PyProf
A GPU performance profiling tool for PyTorch models
☆503 Updated 4 years ago