borjanG / 2023-transformers
Code for the paper "The emergence of clusters in self-attention dynamics".
☆15 · Updated last year
Alternatives and similar repositories for 2023-transformers
Users who are interested in 2023-transformers are comparing it to the repositories listed below.
- Code for Accelerated Linearized Laplace Approximation for Bayesian Deep Learning (ELLA, NeurIPS '22) ☆16 · Updated 2 years ago
- Transformers with doubly stochastic attention ☆45 · Updated 2 years ago
- Supplementary code for the paper "Meta-Solver for Neural Ordinary Differential Equations" https://arxiv.org/abs/2103.08561 ☆25 · Updated 4 years ago
- ☆53 · Updated 7 months ago
- ☆25 · Updated 2 years ago
- ☆22 · Updated 2 years ago
- Repo for the paper "Lie Point Symmetry Data Augmentation for Neural PDE Solvers" ☆50 · Updated last year
- ☆34 · Updated 2 years ago
- Quantification of Uncertainty with Adversarial Models ☆28 · Updated last year
- u-MPS implementation and experimentation code used in the paper Tensor Networks for Probabilistic Sequence Modeling (https://arxiv.org/ab… ☆19 · Updated 4 years ago
- Model hub for all your DiffeqML needs. Pretrained weights, modules, and basic inference infrastructure ☆25 · Updated 2 years ago
- Efficient Riemannian Optimization on Stiefel Manifold via Cayley Transform ☆40 · Updated 6 years ago
- Implementation for our paper "How Far Can Transformers Reason? The Locality Barrier and Inductive Scratchpad" ☆12 · Updated 11 months ago
- ☆37 · Updated 3 years ago
- General Invertible Transformations for Flow-based Generative Models ☆18 · Updated 4 years ago
- Euclidean Wasserstein-2 optimal transportation ☆47 · Updated last year
- Code for "Neural Conservation Laws: A Divergence-Free Perspective". ☆38 · Updated 2 years ago
- Monotone operator equilibrium networks ☆52 · Updated 4 years ago
- ☆67 · Updated 5 months ago
- Experiments from the paper "On Second Order Behaviour in Augmented Neural ODEs" ☆58 · Updated 7 months ago
- ☆53 · Updated 9 months ago
- Laplace Redux -- Effortless Bayesian Deep Learning ☆42 · Updated 2 years ago
- ☆32 · Updated 7 months ago
- PyTorch (PyG) and TensorFlow (Keras/Spektral) implementation of Total Variation Graph Neural Network (TVGNN), as presented at ICML 2023. ☆20 · Updated 2 months ago
- PyTorch implementation for "Particle Flow Bayes' Rule" ☆14 · Updated 5 years ago
- DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule ☆60 · Updated last year
- Meta-learning inductive biases in the form of useful conserved quantities. ☆37 · Updated 2 years ago
- ☆18 · Updated 2 years ago
- Code accompanying our paper "Feature Learning in Infinite-Width Neural Networks" (https://arxiv.org/abs/2011.14522) ☆62 · Updated 4 years ago
- Blog post ☆17 · Updated last year