lucidrains / agent-attention-pytorch
Implementation of Agent Attention in Pytorch
☆93 · Jul 10, 2024 · Updated last year
Alternatives and similar repositories for agent-attention-pytorch
Users who are interested in agent-attention-pytorch are comparing it to the libraries listed below
- Explorations into the recently proposed Taylor Series Linear Attention ☆100 · Aug 18, 2024 · Updated last year
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆53 · Oct 22, 2023 · Updated 2 years ago
- Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction" ☆59 · Oct 22, 2023 · Updated 2 years ago
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch ☆91 · Dec 22, 2023 · Updated 2 years ago
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind ☆179 · Sep 12, 2024 · Updated last year
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch ☆39 · Mar 29, 2022 · Updated 3 years ago
- CUDA implementation of autoregressive linear attention, with all the latest research findings ☆46 · May 23, 2023 · Updated 2 years ago
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena ☆207 · Aug 26, 2023 · Updated 2 years ago
- Implementation of an Attention layer where each head can attend to more than just one token, using coordinate descent to pick topk ☆47 · Jul 16, 2023 · Updated 2 years ago
- Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch ☆135 · Oct 15, 2025 · Updated 4 months ago
- My explorations into editing the knowledge and memories of an attention network ☆35 · Dec 8, 2022 · Updated 3 years ago
- Implementation of a Light Recurrent Unit in Pytorch ☆49 · Oct 6, 2024 · Updated last year
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ☆103 · Dec 22, 2024 · Updated last year
- Toy genetic algorithm in Pytorch ☆55 · Apr 29, 2025 · Updated 9 months ago
- Implementation of the Llama architecture with RLHF + Q-learning ☆170 · Feb 1, 2025 · Updated last year
- Implementation of Rotary Embeddings, from the Roformer paper, in Pytorch ☆804 · Jan 30, 2026 · Updated 2 weeks ago
- Implementation of the transformer proposed in "Building Blocks for a Complex-Valued Transformer Architecture" ☆88 · Oct 13, 2023 · Updated 2 years ago
- Implementation of Dreamcraft3D, 3D content generation in Pytorch ☆81 · Oct 29, 2023 · Updated 2 years ago
- Implementation of Infini-Transformer in Pytorch ☆112 · Jan 4, 2025 · Updated last year
- Implementation of the convolutional module from the Conformer paper, for use in Transformers ☆433 · May 17, 2023 · Updated 2 years ago
- Local Attention - Flax module for Jax ☆22 · May 26, 2021 · Updated 4 years ago
- Implementation of Metaformer, but in an autoregressive manner ☆26 · Jun 21, 2022 · Updated 3 years ago
- Implementation of Chroma, generative models of protein using DDPM and GNNs, in Pytorch ☆160 · Dec 27, 2022 · Updated 3 years ago
- Implementation of Strassen attention, from Kozachinskiy et al. of National Center of AI in Chile ☆41 · Jul 8, 2025 · Updated 7 months ago
- Fine-tune copilot based on your codebase ☆12 · Mar 26, 2024 · Updated last year
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" from Pytorch and Zeta ☆13 · Nov 11, 2024 · Updated last year
- text to speech ☆10 · Mar 19, 2024 · Updated last year
- Implementation of Autoregressive Diffusion in Pytorch ☆432 · Dec 4, 2025 · Updated 2 months ago
- Sound field reconstruction using neural processes with dynamic kernels ☆15 · Mar 25, 2025 · Updated 10 months ago
- Sequence-based prediction of peptide-TCR interactions using paired chain data ☆13 · Feb 2, 2026 · Updated 2 weeks ago
- Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant … ☆15 · Mar 11, 2024 · Updated last year
- AI Voice Agents: Exploring the Next Generation of Human-Machine Interaction! 🎙️🤖🎧 ☆10 · Aug 30, 2024 · Updated last year
- Implementation of the Triangle Multiplicative module, used in Alphafold2 as an efficient way to mix rows or columns of a 2d feature map, … ☆39 · Aug 3, 2021 · Updated 4 years ago
- SpyGame: An interactive multi-agent framework to evaluate intelligence with large language models :D ☆15 · Nov 9, 2023 · Updated 2 years ago
- Deep Learning tools For Biology ☆10 · Apr 18, 2022 · Updated 3 years ago
- A repository comprising of code for generation of noisy speech data from clean data using deep learning methods ☆16 · Jul 12, 2021 · Updated 4 years ago
- Speaker adaptive forced alignment (phonetic segmentation) using Wav2Vec2 ☆22 · Jan 13, 2026 · Updated last month
- PegasusX: The Future of Multimodal Embeddings 🦄 🦄 ☆14 · Oct 16, 2024 · Updated last year
- The Marketing Swarm Template is a powerful, easy-to-use framework built on top of Swarms for creating multi-platform marketing content us… ☆20 · Oct 6, 2025 · Updated 4 months ago