shwinshaker / LipGrow
An adaptive training algorithm for residual networks
☆14 Updated 4 years ago
Related projects:
- DiWA: Diverse Weight Averaging for Out-of-Distribution Generalization ☆27 Updated last year
- Implementation of the models and datasets used in "An Information-theoretic Approach to Distribution Shifts" ☆24 Updated 2 years ago
- Official code for "Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving", ICML 2021 ☆25 Updated 2 years ago
- A PyTorch implementation for the LSTM experiments in the paper: Why Gradient Clipping Accelerates Training: A Theoretical Justification f… ☆44 Updated 4 years ago
- [ICLR 2021] "Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning" by Tianlong Chen*, Zhenyu Zhang*, Sijia Liu, S… ☆22 Updated 2 years ago
- PyTorch implementation for "The Surprising Positive Knowledge Transfer in Continual 3D Object Shape Reconstruction" ☆33 Updated 2 years ago
- [ICLR2024] (EvALign-ICL Benchmark) Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context … ☆20 Updated 6 months ago
- ICML 2020, Estimating Generalization under Distribution Shifts via Domain-Invariant Representations ☆21 Updated 4 years ago
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns … ☆15 Updated 10 months ago
- [JMLR] TRADES + random smoothing for certifiable robustness ☆14 Updated 4 years ago
- Code for Reparameterizable Subset Sampling via Continuous Relaxations, IJCAI 2019. ☆49 Updated 11 months ago
- Code base for SRSGD. ☆28 Updated 4 years ago
- [NeurIPS'20] Code for the Paper Compositional Visual Generation and Inference with Energy Based Models ☆43 Updated last year
- Low-variance and unbiased gradient for backpropagation through categorical random variables, with application in variational auto-encoder… ☆17 Updated 4 years ago
- Official PyTorch code release for Implicit Gradient Transport, NeurIPS'19 ☆21 Updated 5 years ago
- Self-Distillation with weighted ground-truth targets; ResNet and Kernel Ridge Regression ☆17 Updated 2 years ago
- Reproducible code for Augmentation paper ☆18 Updated 5 years ago
- Anytime Learning At Macroscale ☆9 Updated 2 years ago
- TensorFlow implementation of "Meta Dropout: Learning to Perturb Latent Features for Generalization" (ICLR 2020) ☆26 Updated 4 years ago
- Stabilizing Gradients for Deep Neural Networks via Efficient SVD Parameterization ☆16 Updated 6 years ago
- Interpolation between Residual and Non-Residual Networks, ICML 2020. https://arxiv.org/abs/2006.05749 ☆26 Updated 4 years ago