PyTorch implementation of Compressive Transformers, from DeepMind
☆163 · Updated Oct 4, 2021
Alternatives and similar repositories for compressive-transformer-pytorch
Users interested in compressive-transformer-pytorch are comparing it to the libraries listed below.
- An implementation of Transformer with Expire-Span, a circuit for learning which memories to retain ☆34 · Updated Oct 30, 2020
- Axial Positional Embedding for PyTorch ☆84 · Updated Feb 25, 2025
- Graph neural network message passing reframed as a Transformer with local attention ☆70 · Updated Dec 24, 2022
- V-MPO PyTorch version with DMLab30 and GTrXL ☆13 · Updated Mar 1, 2021
- sigma-MoE layer ☆21 · Updated Jan 5, 2024
- Sinkhorn Transformer - Practical implementation of Sparse Sinkhorn Attention ☆269 · Updated Aug 10, 2021
- Implementation of COCO-LM, Correcting and Contrasting Text Sequences for Language Model Pretraining, in PyTorch ☆46 · Updated Mar 3, 2021
- Implementation of the algorithm detailed in the paper "Evolutionary design of molecules based on deep learning and a genetic algorithm" ☆24 · Updated Dec 15, 2023
- A dataset of image-text pairs created using the cosine similarity of the CLIP embeddings of the image and its caption derived f… ☆16 · Updated Apr 22, 2021
- An implementation of (Induced) Set Attention Block, from the Set Transformers paper ☆67 · Updated Jan 10, 2023
- Implementation of fused cosine similarity attention in the same style as Flash Attention ☆220 · Updated Feb 13, 2023
- Long Range Arena for Benchmarking Efficient Transformers ☆786 · Updated Dec 16, 2023
- An easy PyTorch implementation of "Stabilizing Transformers for Reinforcement Learning" ☆183 · Updated Feb 21, 2023
- Simple implementation of V-MPO, proposed in https://arxiv.org/abs/1909.12238 ☆48 · Updated Nov 10, 2020
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights ☆19 · Updated Oct 9, 2022
- ☆24 · Updated Nov 22, 2022
- ☆15 · Updated Aug 9, 2021
- Implementation of Discrete Key / Value Bottleneck, in PyTorch ☆88 · Updated Jul 9, 2023
- Implementation of a Transformer, but completely in Triton ☆279 · Updated Apr 5, 2022
- Gated Transformer Model for Computer Vision ☆25 · Updated Jul 11, 2021
- ☆17 · Updated Dec 21, 2020
- An attempt at an implementation of Glom, Geoffrey Hinton's new idea that integrates concepts from neural fields, top-down-bottom-up proc… ☆196 · Updated Mar 27, 2021
- Implementation of gMLP, an all-MLP replacement for Transformers, in PyTorch ☆430 · Updated Aug 14, 2021
- PyTorch library for fast Transformer implementations ☆1,765 · Updated Mar 23, 2023
- Implementation of Insertion-deletion Denoising Diffusion Probabilistic Models ☆30 · Updated May 31, 2022
- FastFormers - highly efficient Transformer models for NLU ☆709 · Updated Mar 21, 2025
- ☆14 · Updated Dec 9, 2021
- Implementation of RETRO, DeepMind's retrieval-based attention net, in PyTorch ☆879 · Updated Oct 30, 2023
- ☆389 · Updated Oct 18, 2023
- Trains Transformer model variants; data isn't shuffled between batches ☆143 · Updated Oct 5, 2022
- Python Research Framework ☆107 · Updated Nov 3, 2022
- ☆68 · Updated Aug 29, 2024
- ☆65 · Updated Nov 4, 2021
- A PyTorch Deep Learning Kit ☆12 · Updated Apr 30, 2023
- Transformer training code for sequential tasks ☆609 · Updated Sep 14, 2021
- ☆3,695 · Updated Sep 21, 2022
- An open-source implementation of CLIP ☆33 · Updated Nov 7, 2022
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena ☆207 · Updated Aug 26, 2023
- Evaluating long-term memory of reinforcement learning algorithms ☆165 · Updated Jun 23, 2023