yizhangzzz / transformers-lego
☆18 Updated last year
Related projects
Alternatives and complementary repositories for transformers-lego
- ☆18 Updated last month
- ☆34 Updated last year
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆57 Updated last year
- Transformers with doubly stochastic attention ☆40 Updated 2 years ago
- Meta-learning inductive biases in the form of useful conserved quantities. ☆37 Updated last year
- Euclidean Wasserstein-2 optimal transportation ☆44 Updated last year
- Code for Accelerated Linearized Laplace Approximation for Bayesian Deep Learning (ELLA, NeurIPS 22') ☆16 Updated 2 years ago
- ☆17 Updated 2 years ago
- Experiments for Meta-Learning Symmetries by Reparameterization ☆56 Updated 3 years ago
- Tensorflow implementation and notebooks for Implicit Maximum Likelihood Estimation ☆68 Updated 2 years ago
- ☆59 Updated 2 years ago
- PyTorch implementation for "Probabilistic Circuits for Variational Inference in Discrete Graphical Models", NeurIPS 2020 ☆15 Updated 3 years ago
- Code for "The Intrinsic Dimension of Images and Its Impact on Learning" - ICLR 2021 Spotlight https://openreview.net/forum?id=XJk19XzGq2J ☆65 Updated 6 months ago
- ☆21 Updated 2 years ago
- An adaptive training algorithm for residual network ☆14 Updated 4 years ago
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆34 Updated last year
- ☆19 Updated 4 years ago
- SGD with large step sizes learns sparse features [ICML 2023] ☆32 Updated last year
- Simple Scalable Discrete Diffusion for text in PyTorch ☆27 Updated last month
- [NeurIPS'19] Deep Equilibrium Models Jax Implementation ☆37 Updated 4 years ago
- Code for Unbiased Implicit Variational Inference (UIVI) ☆13 Updated 5 years ago
- ☆21 Updated last year
- Engineering the state of RNN language models (Mamba, RWKV, etc.) ☆32 Updated 5 months ago
- unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" ☆61 Updated 2 years ago
- Code for minimum-entropy coupling. ☆29 Updated 4 months ago
- Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks" ☆60 Updated 2 years ago
- [NeurIPS'20] Code for the Paper Compositional Visual Generation and Inference with Energy Based Models ☆43 Updated last year
- An empirical investigation of deep learning theory ☆16 Updated 5 years ago
- ☆49 Updated 4 years ago
- Quantification of Uncertainty with Adversarial Models ☆27 Updated last year