lucidrains / mlm-pytorch
An implementation of masked language modeling for Pytorch, made as concise and simple as possible
☆181 · Aug 9, 2023 · Updated 2 years ago
Alternatives and similar repositories for mlm-pytorch
Users that are interested in mlm-pytorch are comparing it to the libraries listed below
- A simple and working implementation of Electra, the fastest way to pretrain language models from scratch, in Pytorch ☆235 · Jun 12, 2023 · Updated 2 years ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆49 · Jan 27, 2022 · Updated 4 years ago
- This repository contains the data and code for the paper "Diverse Text Generation via Variational Encoder-Decoder Models with Gaussian Pr… ☆26 · Jun 27, 2022 · Updated 3 years ago
- Axial Positional Embedding for Pytorch ☆84 · Feb 25, 2025 · Updated 11 months ago
- GPT, but made only out of MLPs ☆89 · May 25, 2021 · Updated 4 years ago
- Implementation of E(n)-Transformer, which incorporates attention mechanisms into Welling's E(n)-Equivariant Graph Neural Network ☆226 · Jun 2, 2024 · Updated last year
- ☆11 · Jul 5, 2020 · Updated 5 years ago
- Official code and model checkpoints for our EMNLP 2022 paper "RankGen - Improving Text Generation with Large Ranking Models" (https://arx… ☆137 · Aug 2, 2023 · Updated 2 years ago
- A concise but complete full-attention transformer with a set of promising experimental features from various papers ☆5,800 · Feb 7, 2026 · Updated last week
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https… ☆44 · Aug 10, 2024 · Updated last year
- Implementation and replication of ProGen, Language Modeling for Protein Generation, in Jax ☆113 · Sep 8, 2021 · Updated 4 years ago
- An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols ☆16 · Aug 3, 2021 · Updated 4 years ago
- Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in Pytorch ☆120 · Aug 4, 2021 · Updated 4 years ago
- Implementation of some personal helper functions for Einops, my most favorite tensor manipulation library ❤️ ☆57 · Jan 5, 2023 · Updated 3 years ago
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012 ☆49 · Apr 6, 2022 · Updated 3 years ago
- Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch ☆879 · Oct 30, 2023 · Updated 2 years ago
- Implementation of Cross Transformer for spatially-aware few-shot transfer, in Pytorch ☆54 · Mar 30, 2021 · Updated 4 years ago
- An implementation of (Induced) Set Attention Block, from the Set Transformers paper ☆67 · Jan 10, 2023 · Updated 3 years ago
- Official codebase accompanying our ACL 2022 paper "RELiC: Retrieving Evidence for Literary Claims" (https://relic.cs.umass.edu). ☆20 · May 14, 2022 · Updated 3 years ago
- To be a next-generation DL-based phenotype prediction from genome mutations. ☆19 · May 17, 2021 · Updated 4 years ago
- Implementation of Mixout with PyTorch ☆75 · Dec 21, 2022 · Updated 3 years ago
- Cold Start Similar Artists Ranking with Gravity-Inspired Graph Autoencoders (RecSys 2021) ☆20 · Oct 17, 2021 · Updated 4 years ago
- Implementation of a holodeck, written in Pytorch ☆18 · Nov 1, 2023 · Updated 2 years ago
- 3rd Place solution for Feedback Prize - Predicting Effective Arguments Kaggle competition ☆16 · Sep 6, 2022 · Updated 3 years ago
- ☆54 · Jan 18, 2023 · Updated 3 years ago
- Transformer based on a variant of attention that is linear complexity in respect to sequence length ☆827 · May 5, 2024 · Updated last year
- End-to-end Text-to-Speech with Generative Adversarial Networks ☆20 · Feb 6, 2021 · Updated 5 years ago
- ai4code competition source code ☆19 · Aug 12, 2022 · Updated 3 years ago
- ☆221 · Jun 8, 2020 · Updated 5 years ago
- For the code release of our arXiv paper "Revisiting Few-sample BERT Fine-tuning" (https://arxiv.org/abs/2006.05987). ☆185 · Jun 12, 2023 · Updated 2 years ago
- Pretrain and finetune ELECTRA with fastai and huggingface. (Results of the paper replicated !) ☆331 · Jan 10, 2024 · Updated 2 years ago
- ☆21 · Mar 15, 2023 · Updated 2 years ago
- Residual Quantization Autoencoder, used for interpreting LLMs ☆13 · Jan 1, 2025 · Updated last year
- The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation” ☆15 · Jan 3, 2025 · Updated last year
- Implementation of SE3-Transformers for Equivariant Self-Attention, in Pytorch. This specific repository is geared towards integration wit… ☆325 · Aug 28, 2025 · Updated 5 months ago
- Graph neural network message passing reframed as a Transformer with local attention ☆70 · Dec 24, 2022 · Updated 3 years ago
- ☆16 · Apr 14, 2025 · Updated 10 months ago
- Combines the SSL Method MixMatch with a pre-trained model (EfficientNet) on a chest x-ray dataset. ☆11 · Jun 22, 2019 · Updated 6 years ago
- ☆19 · Feb 5, 2026 · Updated last week