waitwaitforget / modelparallel_pytorch
Model parallelism for PyTorch: training multiple networks on multiple GPUs.
☆28 · Updated 6 years ago
Related projects
Alternatives and complementary repositories for modelparallel_pytorch
- "Layer-wise Adaptive Rate Scaling" in PyTorch ☆86 · Updated 3 years ago
- This repository is no longer maintained. Check ☆82 · Updated 4 years ago
- PyTorch implementation of the hamburger module from the ICLR 2021 paper "Is Attention Better Than Matrix Decomposition?" ☆98 · Updated 3 years ago
- (Batched) advanced indexing for PyTorch. ☆53 · Updated 10 months ago
- ☆61 · Updated 4 years ago
- ☆47 · Updated 3 years ago
- Filter Response Normalization tested on better ImageNet baselines. ☆35 · Updated 4 years ago
- A PyTorch implementation for the LSTM experiments in the paper: Why Gradient Clipping Accelerates Training: A Theoretical Justification f… ☆44 · Updated 4 years ago
- AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks ☆41 · Updated 6 years ago
- Implementation of the reversible residual network in PyTorch ☆101 · Updated 2 years ago
- Distributed, mixed-precision training with PyTorch ☆89 · Updated 4 years ago
- NeurIPS'19: Meta-Weight-Net: Learning an Explicit Mapping For Sample Weighting (PyTorch implementation for class imbalance). ☆34 · Updated 5 years ago
- PyTorch DataLoader that runs on multiple remote machines for heavy data processing ☆66 · Updated 5 years ago
- Unofficial PyTorch implementation of i-ResNet, invertible residual networks. ☆25 · Updated 2 years ago
- Exploiting Uncertainty of Loss Landscape for Stochastic Optimization ☆15 · Updated 5 years ago
- PyTorch Examples repo for "ReZero is All You Need: Fast Convergence at Large Depth" ☆62 · Updated 3 months ago
- [ICLR 2019] ProbGAN: Towards Probabilistic GAN with Theoretical Guarantees ☆32 · Updated 4 years ago
- PyTorch implementation of HashedNets ☆36 · Updated last year
- Code for Self-Tuning Networks (ICLR 2019) https://arxiv.org/abs/1903.03088 ☆53 · Updated 5 years ago
- An implementation of Shampoo ☆74 · Updated 6 years ago
- Utilities for PyTorch ☆89 · Updated 2 years ago
- Code for paper "SWALP: Stochastic Weight Averaging for Low-Precision Training". ☆62 · Updated 5 years ago
- Unofficial PyTorch implementation of ReZero in ResNet ☆23 · Updated 4 years ago
- Piecewise Linear Functions (PWL) implementation in PyTorch ☆48 · Updated 2 years ago
- ☆38 · Updated 4 years ago
- Implementation of soft parameter sharing for neural networks ☆69 · Updated 3 years ago
- tunz's CUDA PyTorch operator (MaskedSoftmax) ☆74 · Updated 5 years ago
- ☆42 · Updated 5 years ago
- Improving generalization by controlling label-noise information in neural network weights. ☆39 · Updated 3 years ago