lucidrains / discrete-key-value-bottleneck-pytorch
Implementation of Discrete Key / Value Bottleneck, in Pytorch
☆88Jul 9, 2023Updated 2 years ago
Alternatives and similar repositories for discrete-key-value-bottleneck-pytorch
Users who are interested in discrete-key-value-bottleneck-pytorch are comparing it to the libraries listed below
- Implementation of a holodeck, written in Pytorch☆18Nov 1, 2023Updated 2 years ago
- Implementation of "compositional attention" from MILA, a multi-head attention variant that is reframed as a two-step attention process wi…☆51May 10, 2022Updated 3 years ago
- A Transformer made of Rotation-equivariant Attention using Vector Neurons☆101Aug 1, 2023Updated 2 years ago
- Local Attention - Flax module for Jax☆22May 26, 2021Updated 4 years ago
- Un-*** 50 billions multimodality dataset☆23Sep 14, 2022Updated 3 years ago
- JAX implementation ViT-VQGAN☆82Sep 21, 2022Updated 3 years ago
- Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction"☆59Oct 22, 2023Updated 2 years ago
- Standalone Product Key Memory module in Pytorch - for augmenting Transformer models☆87Nov 1, 2025Updated 3 months ago
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind☆179Sep 12, 2024Updated last year
- An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols☆16Aug 3, 2021Updated 4 years ago
- My explorations into editing the knowledge and memories of an attention network☆35Dec 8, 2022Updated 3 years ago
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts☆123Oct 17, 2024Updated last year
- Implementation of Differentiable Sign-Distance Function Rendering - in Pytorch☆70May 9, 2022Updated 3 years ago
- Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate …☆641Jul 17, 2023Updated 2 years ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"☆103Dec 22, 2024Updated last year
- Hidden Engrams: Long Term Memory for Transformer Model Inference☆35Jun 26, 2021Updated 4 years ago
- Implementation of fused cosine similarity attention in the same style as Flash Attention☆220Feb 13, 2023Updated 3 years ago
- [Unofficial] Kakaotrans: Kakao translate API for python☆16Mar 29, 2020Updated 5 years ago
- A simple implementation of a deep linear Pytorch module☆21Oct 16, 2020Updated 5 years ago
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation☆90Oct 11, 2024Updated last year
- Official Pytorch implementation of Online Continual Learning on Class Incremental Blurry Task Configuration with Anytime Inference (ICLR …☆55Sep 7, 2022Updated 3 years ago
- [HCLT 2022] Korean sentence text similarity dataset using naver shopping review☆25Oct 20, 2022Updated 3 years ago
- OSLO: Open Source for Large-scale Optimization☆175Sep 9, 2023Updated 2 years ago
- Implementation of Block Recurrent Transformer - Pytorch☆224Aug 20, 2024Updated last year
- Code for EMNLP-IJCNLP 2019 MRQA Workshop Paper: "Domain-agnostic Question-Answering with Adversarial Training"☆40Jul 25, 2024Updated last year
- PyTorch implementation for all methods and environments in the paper "MIMEx: Intrinsic Rewards from Masked Input Modeling"☆16May 17, 2023Updated 2 years ago
- ☆13Mar 2, 2025Updated 11 months ago
- ☆12Nov 25, 2018Updated 7 years ago
- Implementation of an Attention layer where each head can attend to more than just one token, using coordinate descent to pick topk☆47Jul 16, 2023Updated 2 years ago
- ☆10Sep 7, 2022Updated 3 years ago
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" from Pytorch and Zeta☆13Nov 11, 2024Updated last year
- DiCE: The Infinitely Differentiable Monte-Carlo Estimator☆32Jul 28, 2023Updated 2 years ago
- Implementation of Insertion-deletion Denoising Diffusion Probabilistic Models☆30May 31, 2022Updated 3 years ago
- Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch☆104Oct 10, 2023Updated 2 years ago
- Explorations into the recently proposed Taylor Series Linear Attention☆100Aug 18, 2024Updated last year
- Recursive Least Squares (RLS) with Neural Network for fast learning☆59Nov 16, 2023Updated 2 years ago
- Implementation of a U-net complete with efficient attention as well as the latest research findings☆292May 3, 2024Updated last year
- An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates concepts from neural fields, top-down-bottom-up proc…☆196Mar 27, 2021Updated 4 years ago
- Implementation of a Light Recurrent Unit in Pytorch☆49Oct 6, 2024Updated last year