lucidrains / VN-transformer
A Transformer made of Rotation-equivariant Attention using Vector Neurons
☆101 · Aug 1, 2023 · Updated 2 years ago
Alternatives and similar repositories for VN-transformer
Users who are interested in VN-transformer are comparing it to the libraries listed below.
- Implementation of an Attention layer where each head can attend to more than just one token, using coordinate descent to pick topk ☆47 · Jul 16, 2023 · Updated 2 years ago
- Implementation of E(n)-Transformer, which incorporates attention mechanisms into Welling's E(n)-Equivariant Graph Neural Network ☆226 · Jun 2, 2024 · Updated last year
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts ☆123 · Oct 17, 2024 · Updated last year
- Standalone Product Key Memory module in Pytorch - for augmenting Transformer models ☆87 · Nov 1, 2025 · Updated 3 months ago
- Implementation of Discrete Key / Value Bottleneck, in Pytorch ☆88 · Jul 9, 2023 · Updated 2 years ago
- Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch ☆104 · Oct 10, 2023 · Updated 2 years ago
- Vector Neuron pointcloud networks for classification and segmentation. Separate training setups for VN-DGCNN and VN-PointNet ☆32 · May 31, 2022 · Updated 3 years ago
- ☆31 · Mar 28, 2023 · Updated 2 years ago
- Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch ☆344 · Apr 2, 2025 · Updated 10 months ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆49 · Jan 27, 2022 · Updated 4 years ago
- Implementation of 2-simplicial attention proposed by Clift et al. (2019) and the recent attempt to make practical in Fast and Simplex, Ro… ☆46 · Sep 2, 2025 · Updated 5 months ago
- Local Attention - Flax module for Jax ☆22 · May 26, 2021 · Updated 4 years ago
- Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction" ☆59 · Oct 22, 2023 · Updated 2 years ago
- Correspondence-Free Point Cloud Registration with SO(3)-Equivariant Implicit Shape Representations ☆19 · May 11, 2023 · Updated 2 years ago
- Implementation of Rotary Embeddings, from the Roformer paper, in Pytorch ☆804 · Jan 30, 2026 · Updated 2 weeks ago
- Axial Positional Embedding for Pytorch ☆84 · Feb 25, 2025 · Updated 11 months ago
- JAX implementation of ViT-VQGAN ☆63 · Jul 23, 2022 · Updated 3 years ago
- Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT ☆224 · Aug 20, 2024 · Updated last year
- Implementation of Strassen attention, from Kozachinskiy et al. of National Center of AI in Chile ☆41 · Jul 8, 2025 · Updated 7 months ago
- Graph neural network message passing reframed as a Transformer with local attention ☆70 · Dec 24, 2022 · Updated 3 years ago
- Fork of HyenaDNA, a long-range genomic foundation model built with Hyena ☆10 · Aug 14, 2023 · Updated 2 years ago
- Active Learning of Abstract Plan Feasibility ☆12 · Feb 10, 2023 · Updated 3 years ago
- Implementation of Insertion-deletion Denoising Diffusion Probabilistic Models ☆30 · May 31, 2022 · Updated 3 years ago
- ☆25 · May 4, 2023 · Updated 2 years ago
- Implementation of Infini-Transformer in Pytorch ☆112 · Jan 4, 2025 · Updated last year
- Code for the "Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning" paper. ☆16 · Nov 21, 2025 · Updated 2 months ago
- Implementation of Deformable Attention in Pytorch from the paper "Vision Transformer with Deformable Attention" ☆372 · Feb 3, 2025 · Updated last year
- PyTorch implementation for the paper Equivariant Point Network for 3D Point Cloud Analysis (CVPR2021). ☆113 · Feb 12, 2023 · Updated 3 years ago
- Scale Optimized Spline SLAM ☆53 · Apr 7, 2022 · Updated 3 years ago
- An implementation of (Induced) Set Attention Block, from the Set Transformers paper ☆67 · Jan 10, 2023 · Updated 3 years ago
- Implementation of Cross Transformer for spatially-aware few-shot transfer, in Pytorch ☆54 · Mar 30, 2021 · Updated 4 years ago
- Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch ☆135 · Oct 15, 2025 · Updated 4 months ago
- Code for Equivariant Transporter Network ☆23 · Apr 17, 2023 · Updated 2 years ago
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights ☆19 · Oct 9, 2022 · Updated 3 years ago
- GPT for FACodec ☆13 · Mar 25, 2024 · Updated last year
- Multidimensional indexing for tensors ☆137 · Jul 17, 2023 · Updated 2 years ago
- Implementation of Denoising Diffusion for protein design, but using the new Equiformer (successor to SE3 Transformers) with some addition… ☆57 · Dec 27, 2022 · Updated 3 years ago
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆53 · Oct 22, 2023 · Updated 2 years ago
- Implementation of a U-net complete with efficient attention as well as the latest research findings ☆292 · May 3, 2024 · Updated last year