A Pytorch implementation of Attention on Attention module (both self and guided variants), for Visual Question Answering
☆43Nov 8, 2020Updated 5 years ago
Alternatives and similar repositories for AoA-pytorch
Users that are interested in AoA-pytorch are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- An implementation of (Induced) Set Attention Block, from the Set Transformers paper☆68Jan 10, 2023Updated 3 years ago
- Implementation of Memformer, a Memory-augmented Transformer, in Pytorch☆126Nov 13, 2020Updated 5 years ago
- Shows visual grounding methods can be right for the wrong reasons! (ACL 2020)☆23Jun 26, 2020Updated 5 years ago
- A simple Transformer where the softmax has been replaced with normalization☆20Sep 11, 2020Updated 5 years ago
- Re-implementation for 'R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering'.☆12Mar 13, 2026Updated 2 months ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- [CVPR 2026] Elucidating the SNR-t Bias of Diffusion Probabilistic Models☆120Apr 20, 2026Updated last month
- Implementation of Insertion-deletion Denoising Diffusion Probabilistic Models☆30May 31, 2022Updated 3 years ago
- Includes additional materials for the following keras.io blog post.☆12Jun 23, 2021Updated 4 years ago
- ☆37Jan 20, 2023Updated 3 years ago
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012☆49Apr 6, 2022Updated 4 years ago
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount…☆53Oct 22, 2023Updated 2 years ago
- A variant of Transformer-XL where the memory is updated not with a queue, but with attention☆49Jul 31, 2020Updated 5 years ago
- My explorations into editing the knowledge and memories of an attention network☆35Dec 8, 2022Updated 3 years ago
- Implementation of Tranception, an attention network, paired with retrieval, that is SOTA for protein fitness prediction☆32Jun 19, 2022Updated 3 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- Pytorch implementation of the hamburger module from the ICLR 2021 paper "Is Attention Better Than Matrix Decomposition"☆99Jan 13, 2021Updated 5 years ago
- Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single machine microbatches, in Pytorch☆25Jan 21, 2025Updated last year
- My own attempt at a long context genomics model, leveraging recent advances in long context attention modeling (Flash Attention + other h…☆54Jul 2, 2023Updated 2 years ago
- Implementation of the Triangle Multiplicative module, used in Alphafold2 as an efficient way to mix rows or columns of a 2d feature map, …☆39Aug 3, 2021Updated 4 years ago
- Minimal implementation of Denoised Smoothing (https://arxiv.org/abs/2003.01908) in TensorFlow.☆20Aug 4, 2021Updated 4 years ago
- This repository provides the dataset introduced by our WSSTG paper☆13Jul 21, 2019Updated 6 years ago
- Implementation of Denoising Diffusion for protein design, but using the new Equiformer (successor to SE3 Transformers) with some addition…☆57Dec 27, 2022Updated 3 years ago
- Implementation of Differentiable Sign-Distance Function Rendering - in Pytorch☆70May 9, 2022Updated 4 years ago
- Implementation of the Remixer Block from the Remixer paper, in Pytorch☆36Sep 27, 2021Updated 4 years ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- Audio Generation model working with GPT-2 and VQVAE compressed representation of MelSpectrograms☆18Oct 8, 2023Updated 2 years ago
- Learning Long-term Visual Dynamics with Region Proposal Interaction Networks (ICLR 2021)☆113May 29, 2022Updated 4 years ago
- Implementation of Cross Transformer for spatially-aware few-shot transfer, in Pytorch☆54Mar 30, 2021Updated 5 years ago
- Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning☆166Feb 12, 2024Updated 2 years ago
- Implementation of Kronecker Attention in Pytorch☆20Sep 12, 2020Updated 5 years ago
- A GPT, made only of MLPs, in Jax☆59Jun 23, 2021Updated 4 years ago
- Implementation of OmniNet, Omnidirectional Representations from Transformers, in Pytorch☆59Mar 19, 2021Updated 5 years ago
- Implementation of Metaformer, but in an autoregressive manner☆26Jun 21, 2022Updated 3 years ago
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts☆122Oct 17, 2024Updated last year
- Serverless GPU API endpoints on Runpod - Get Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- Implementation of a holodeck, written in Pytorch☆19Nov 1, 2023Updated 2 years ago
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch☆92Dec 22, 2023Updated 2 years ago
- Another attempt at a long-context / efficient transformer by me☆38Apr 11, 2022Updated 4 years ago
- Usable implementation of Emerging Symbol Binding Network (ESBN), in Pytorch☆25Jan 6, 2021Updated 5 years ago
- ☆15Oct 27, 2020Updated 5 years ago
- Exploration into the Scaling Value Iteration Networks paper, from Schmidhuber's group☆37Sep 23, 2024Updated last year
- Implementation of an Attention layer where each head can attend to more than just one token, using coordinate descent to pick topk☆47Jul 16, 2023Updated 2 years ago