Implementation of Nyström Self-attention, from the paper Nyströmformer
☆145 · Updated Mar 24, 2025
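As a rough illustration of the technique the repository implements, here is a minimal NumPy sketch of Nyström-approximated softmax attention. This is not the package's API; the function names are illustrative, landmarks are plain segment means (as in the Nyströmformer paper), and `np.linalg.pinv` stands in for the paper's iterative pseudo-inverse.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def nystrom_attention(Q, K, V, num_landmarks=4):
    """Approximate softmax(Q K^T / sqrt(d)) V with the Nystrom method.

    Illustrative sketch: landmarks are segment means of Q and K, and the
    m x m landmark kernel is inverted with np.linalg.pinv rather than the
    iterative pseudo-inverse used in the paper. Assumes n divides evenly
    by num_landmarks.
    """
    n, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    m = num_landmarks
    # landmarks: mean of each of the m consecutive segments
    Q_l = Q.reshape(m, n // m, d).mean(axis=1)   # (m, d)
    K_l = K.reshape(m, n // m, d).mean(axis=1)   # (m, d)
    F = softmax(Q @ K_l.T * scale)               # (n, m)
    A = softmax(Q_l @ K_l.T * scale)             # (m, m)
    B = softmax(Q_l @ K.T * scale)               # (m, n)
    # three small matrices replace the full n x n attention map
    return F @ np.linalg.pinv(A) @ (B @ V)       # (n, d)
```

Since `m` is fixed, the cost scales linearly in sequence length `n` instead of quadratically; for nearly uniform attention the output stays close to exact softmax attention.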
Alternatives and similar repositories for nystrom-attention
Users that are interested in nystrom-attention are comparing it to the libraries listed below.
- Implementation of Token Shift GPT, an autoregressive model that relies solely on shifting the sequence space for mixing (☆49, updated Jan 27, 2022)
- ☆389, updated Oct 18, 2023
- Axial Positional Embedding for PyTorch (☆84, updated Feb 25, 2025)
- Implementation of Cross Transformer for spatially-aware few-shot transfer, in PyTorch (☆54, updated Mar 30, 2021)
- Implementation of the algorithm detailed in the paper "Evolutionary design of molecules based on deep learning and a genetic algorithm" (☆24, updated Dec 15, 2023)
- Implementation of Rotary Embeddings, from the RoFormer paper, in PyTorch (☆804, updated Jan 30, 2026)
- Implementation of a memory-efficient multi-head attention, as proposed in the paper "Self-attention Does Not Need O(n²) Memory" (☆389, updated Jul 18, 2023)
- AdvMIL: Adversarial Multiple Instance Learning for the Survival Analysis on Whole-Slide Images (Medical Image Analysis 2024) (☆41, updated Nov 4, 2023)
- Graph neural network message passing reframed as a Transformer with local attention (☆70, updated Dec 24, 2022)
- An implementation of the (Induced) Set Attention Block, from the Set Transformers paper (☆67, updated Jan 10, 2023)
- Implementation of Tranception, an attention network paired with retrieval that is SOTA for protein fitness prediction (☆32, updated Jun 19, 2022)
- Usable implementation of Mogrifier, a circuit for enhancing LSTMs and potentially other networks, from DeepMind (☆22, updated Jun 9, 2024)
- An attempt at an implementation of GLOM, Geoffrey Hinton's new idea that integrates concepts from neural fields and top-down-bottom-up proc… (☆196, updated Mar 27, 2021)
- PyTorch implementation of Compressive Transformers, from DeepMind (☆163, updated Oct 4, 2021)
- Implementation / replication of DALL-E, OpenAI's text-to-image Transformer, in PyTorch (☆58, updated Jan 13, 2021)
- Implementation of Mega, the single-head attention with multi-headed EMA architecture that currently holds SOTA on Long Range Arena (☆207, updated Aug 26, 2023)
- Another attempt at a long-context / efficient transformer by me (☆38, updated Apr 11, 2022)
- PyTorch reimplementation of Molecule Attention Transformer, which uses a transformer to tackle the graph-like structure of molecules (☆58, updated Dec 2, 2020)
- ☆24, updated Aug 25, 2022
- Implementation of a Transformer, but completely in Triton (☆279, updated Apr 5, 2022)
- Implementation of a U-net, complete with efficient attention as well as the latest research findings (☆292, updated May 3, 2024)
- Gene mutation and pathway activity prediction from H&E slides (☆35, updated Feb 4, 2022)
- Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning (☆166, updated Feb 12, 2024)
- Implementation of STAM (Space Time Attention Model), a pure and simple attention model that reaches SOTA for video classification (☆134, updated Apr 1, 2021)
- Attention-Based Deep 3D Multiple Instance Learning (☆25, updated Aug 1, 2020)
- ICML 2020: Estimating Generalization under Distribution Shifts via Domain-Invariant Representations (☆23, updated Jun 30, 2020)
- Implementation of fused cosine-similarity attention, in the same style as Flash Attention (☆220, updated Feb 13, 2023)
- DeLighT: Very Deep and Light-Weight Transformers (☆469, updated Oct 16, 2020)
- TransMIL: Transformer-based Correlated Multiple Instance Learning for Whole Slide Image Classification (☆470, updated May 3, 2024)
- Predicting Ovarian Cancer Treatment Response in Histopathology using Hierarchical Vision Transformers and Multiple Instance Learning (☆12, updated Nov 29, 2023)
- ☆352, updated Mar 29, 2025
- Implementation of Flash Attention in JAX (☆227, updated Mar 1, 2024)
- Machine-readable information for the TCGA dataset, mirrored from other sources (☆21, updated Jun 14, 2022)
- Implementation of Linformer for PyTorch (☆305, updated Jan 5, 2024)
- Histopathology Feature Extractors (2024) (☆14, updated Jun 14, 2024)
- Yet Another Diffusion Automation (☆13, updated Aug 21, 2022)
- A Tight-fisted Optimizer (Tiger), implemented in PyTorch (☆12, updated Jun 26, 2024)
- Implementation of the Adan (ADAptive Nesterov momentum) optimizer in PyTorch (☆253, updated Sep 1, 2022)
- DAR introduces a diagonal scanning order for next-token prediction and proposes a direction-aware autoregressive transformer framework (☆18, updated Apr 16, 2025)