borjanG / 2023-transformers
Code for the paper "The emergence of clusters in self-attention dynamics".
☆15 Updated last year
Alternatives and similar repositories for 2023-transformers
Users who are interested in 2023-transformers are comparing it to the repositories listed below.
- Code for Accelerated Linearized Laplace Approximation for Bayesian Deep Learning (ELLA, NeurIPS 22') ☆16 Updated 2 years ago
- ☆13 Updated 3 years ago
- Supplementary code for the paper "Meta-Solver for Neural Ordinary Differential Equations" https://arxiv.org/abs/2103.08561 ☆25 Updated 4 years ago
- Blog post ☆17 Updated last year
- Laplace Redux -- Effortless Bayesian Deep Learning ☆42 Updated 2 years ago
- General Invertible Transformations for Flow-based Generative Models ☆18 Updated 4 years ago
- Euclidean Wasserstein-2 optimal transportation ☆47 Updated last year
- Transformers with doubly stochastic attention ☆45 Updated 2 years ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆81 Updated last year
- ☆53 Updated 10 months ago
- A minimal implementation of a VAE with BinConcrete (relaxed Bernoulli) latent distribution in TensorFlow. ☆22 Updated 5 years ago
- ☆18 Updated 2 years ago
- ☆22 Updated 2 years ago
- ☆26 Updated 2 years ago
- Meta-learning inductive biases in the form of useful conserved quantities. ☆37 Updated 2 years ago
- ☆37 Updated 3 years ago
- ☆16 Updated 8 months ago
- PyTorch implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" ☆37 Updated 3 years ago
- Monotone operator equilibrium networks ☆52 Updated 4 years ago
- Repo to accompany paper "Implicit Self-Regularization in Deep Neural Networks..." ☆44 Updated 6 years ago
- Code for "Training Deep Energy-Based Models with f-Divergence Minimization" ICML 2020 ☆36 Updated 2 years ago
- Investigate the speed of adaptation of structural causal models ☆15 Updated 4 years ago
- Code for experiments on transformers using Markovian data. ☆15 Updated 6 months ago
- ModelDiff: A Framework for Comparing Learning Algorithms ☆56 Updated last year
- ☆15 Updated 2 years ago
- Repo to the paper "Lie Point Symmetry Data Augmentation for Neural PDE Solvers" ☆50 Updated 2 years ago
- Implementation of OpenAI's 'Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets' paper. ☆36 Updated last year
- Implementation of Action Matching for the Schrödinger equation ☆24 Updated last year
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆58 Updated last year
- PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023 ☆20 Updated 2 years ago