Lightning-AI / forked-pdb
Python pdb for multiple processes
☆32 Updated 2 years ago
Related projects
Alternatives and complementary repositories for forked-pdb
- (Batched) advanced indexing for PyTorch. ☆53 Updated 11 months ago
- Implementation of Kronecker Attention in PyTorch ☆17 Updated 4 years ago
- ☆29 Updated 2 years ago
- Stabilizing Gradients for Deep Neural Networks via Efficient SVD Parameterization ☆16 Updated 6 years ago
- Partially Adaptive Momentum Estimation method in the paper "Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep … ☆39 Updated last year
- Code for SegTree Transformer (ICLR-RLGM 2019). ☆27 Updated 5 years ago
- An adaptive training algorithm for residual networks ☆14 Updated 4 years ago
- ICML 2019 accepted paper: Overcoming Multi-Model Forgetting ☆13 Updated 5 years ago
- Fork of diux-dev/imagenet18 ☆14 Updated 6 years ago
- Fast Discounted Cumulative Sums in PyTorch ☆95 Updated 3 years ago
- ☆25 Updated 3 years ago
- Interpolation between Residual and Non-Residual Networks, ICML 2020. https://arxiv.org/abs/2006.05749 ☆26 Updated 4 years ago
- Code release to reproduce ASHA experiments from "Random Search and Reproducibility for NAS." ☆22 Updated 5 years ago
- A PyTorch implementation of Adafactor (https://arxiv.org/pdf/1804.04235.pdf) ☆24 Updated 5 years ago
- Official PyTorch implementation for the paper 'SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients' ☆17 Updated 2 years ago
- ☆40 Updated last year
- Code for the paper "SWALP: Stochastic Weight Averaging for Low-Precision Training". ☆62 Updated 5 years ago
- This repository is no longer maintained. Check ☆82 Updated 4 years ago
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns … ☆16 Updated last year
- A collection of optimizers, some arcane, others well known, for Flax. ☆29 Updated 3 years ago
- Exploiting Uncertainty of Loss Landscape for Stochastic Optimization ☆15 Updated 5 years ago
- Code for the paper "Query-Key Normalization for Transformers" ☆35 Updated 3 years ago
- Code for DATA: Differentiable ArchiTecture Approximation. ☆11 Updated 3 years ago
- Implementation of ACProp (Momentum centering and asynchronous update for adaptive gradient methods, NeurIPS 2021) ☆15 Updated 3 years ago
- LV-BERT: Exploiting Layer Variety for BERT (Findings of ACL 2021) ☆18 Updated last year
- ☆18 Updated 5 months ago
- ☆45 Updated 4 months ago