tonyduan / transformer-blocks
Multi-Head Attention, Transformer, Perceiver, Linear Attention.
☆12 Updated 2 years ago
Alternatives and similar repositories for transformer-blocks
Users who are interested in transformer-blocks are comparing it to the libraries listed below.
- Relative gradient optimization of the Jacobian term in unsupervised deep learning, NeurIPS 2020 ☆21 Updated 4 years ago
- A differentiation API for PyTorch ☆30 Updated 5 years ago
- Continuous-time gradient flow for generative modeling and variational inference ☆33 Updated 7 years ago
- Hierarchical variational models for physics. ☆18 Updated 5 years ago
- Code for "'Hey, that's not an ODE:' Faster ODE Adjoints via Seminorms" (ICML 2021) ☆88 Updated 3 years ago
- Jax-based MaxEnt ☆17 Updated 6 years ago
- Riemannian Convex Potential Maps ☆67 Updated 2 years ago
- "Parameter origami" -- folding and unfolding collections of parameters for optimization and sensitivity analysis. ☆14 Updated last year
- [NeurIPS 2020] Neural Manifold Ordinary Differential Equations (https://arxiv.org/abs/2006.10254) ☆121 Updated 2 years ago
- ☆15 Updated 4 years ago
- PyTorch implementation of 'Semi-Implicit Methods for Deep Neural Networks' ☆25 Updated 6 years ago
- PyTorch implementation of the paper Neural Canonical Transformation with Symplectic Flows ☆30 Updated 5 years ago
- A public repository for our paper, Rao-Blackwellized Stochastic Gradients for Discrete Distributions ☆22 Updated 6 years ago
- ☆48 Updated 2 years ago
- ☆22 Updated 5 years ago
- A generic tensor-network library designed for quantum simulation, based on PyTorch ☆60 Updated 6 years ago
- ☆22 Updated 3 years ago
- MintNet: Building Invertible Neural Networks with Masked Convolutions ☆39 Updated 4 years ago
- Experiments with Neural ODEs and Adversarial Attacks ☆44 Updated 6 years ago
- Python code for our paper "Adversarial Domain Adaptation for Identifying Phase Transitions" ☆20 Updated 7 years ago
- ☆16 Updated 5 months ago
- Supplementary code for the paper "Meta-Solver for Neural Ordinary Differential Equations" https://arxiv.org/abs/2103.08561 ☆25 Updated 4 years ago
- Code to minimize the Variational Contrastive Divergence (VCD) ☆29 Updated 6 years ago
- Canonical normalizing flows ☆10 Updated 6 years ago
- Library for normalizing flows and neural flows. ☆25 Updated 3 years ago
- Efficient Householder Transformation in PyTorch ☆66 Updated 4 years ago
- Autoregressive Energy Machines ☆78 Updated 3 years ago
- Code for Understanding and Mitigating Exploding Inverses in Invertible Neural Networks (AISTATS 2021) http://arxiv.org/abs/2006.09347 ☆30 Updated 5 years ago
- ☆59 Updated 6 years ago
- Hamiltonian Dynamics with Non-Newtonian Momentum for Rapid Sampling ☆36 Updated 4 years ago