shwinshaker / LipGrow
An adaptive training algorithm for residual networks
☆15 Updated 4 years ago
Alternatives and similar repositories for LipGrow:
Users who are interested in LipGrow are comparing it to the repositories listed below.
- Official code for "Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving", ICML 2021 ☆27 Updated 3 years ago
- ☆13 Updated 2 years ago
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns … ☆16 Updated last year
- Source code for paper Conservative Uncertainty Estimation By Fitting Prior Networks (ICLR 2020) ☆21 Updated 2 years ago
- Code base for SRSGD. ☆28 Updated 5 years ago
- A PyTorch implementation for the LSTM experiments in the paper: Why Gradient Clipping Accelerates Training: A Theoretical Justification f… ☆45 Updated 5 years ago
- ☆22 Updated last year
- ☆41 Updated 2 years ago
- ☆17 Updated 2 years ago
- STABILIZING GRADIENTS FOR DEEP NEURAL NETWORKS VIA EFFICIENT SVD PARAMETERIZATION ☆16 Updated 6 years ago
- [NeurIPS'20] Code for the Paper Compositional Visual Generation and Inference with Energy Based Models ☆44 Updated 2 years ago
- ☆36 Updated 4 years ago
- [ICLR2024] (EvALign-ICL Benchmark) Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context … ☆22 Updated last year
- Implementation of the models and datasets used in "An Information-theoretic Approach to Distribution Shifts" ☆25 Updated 3 years ago
- ☆20 Updated 4 years ago
- Code associated with our paper "Learning Group Structure and Disentangled Representations of Dynamical Environments" ☆15 Updated 2 years ago
- Self-Distillation with weighted ground-truth targets; ResNet and Kernel Ridge Regression ☆17 Updated 3 years ago
- ☆29 Updated 3 years ago
- Reproducible code for Augmentation paper ☆17 Updated 6 years ago
- This repository hosts the dataset and source code for "A causal view of compositional zero-shot recognition". Yuval Atzmon, Felix Kreuk, … ☆27 Updated 3 years ago
- DiWA: Diverse Weight Averaging for Out-of-Distribution Generalization ☆29 Updated 2 years ago
- Interpolation between Residual and Non-Residual Networks, ICML 2020. https://arxiv.org/abs/2006.05749 ☆26 Updated 4 years ago
- Implementation of Kronecker Attention in PyTorch ☆18 Updated 4 years ago
- Estimating Gradients for Discrete Random Variables by Sampling without Replacement ☆40 Updated 5 years ago
- Paper and Code for "Curriculum Learning by Optimizing Learning Dynamics" (AISTATS 2021) ☆19 Updated 3 years ago
- Open source code for paper "On the Learning and Learnability of Quasimetrics". ☆32 Updated 2 years ago
- [ICML'21] Improved Contrastive Divergence Training of Energy Based Models ☆62 Updated 2 years ago
- Gradient-based Hyperparameter Optimization Over Long Horizons ☆13 Updated 3 years ago
- Code for ICLR 2022 Paper, "Controlling Directions Orthogonal to a Classifier" ☆35 Updated last year
- Code for paper "Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning", Ren et al., NeurIPS'20 ☆25 Updated 4 years ago