yashbonde / RNN-sim
Running massive simulations using RNNs on CPUs for building bots and all kinds of things.
☆13Updated 3 years ago
Alternatives and similar repositories for RNN-sim:
Users that are interested in RNN-sim are comparing it to the libraries listed below
- ☆43Updated 4 years ago
- Official Pytorch Implementation of Length-Adaptive Transformer (ACL 2021)☆101Updated 4 years ago
- ☆47Updated 4 years ago
- Pytorch library for factorized L0-based pruning.☆44Updated last year
- The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization".☆32Updated 3 years ago
- This repo contains code to reproduce some of the results presented in the paper "SentenceMIM: A Latent Variable Language Model"☆28Updated 2 years ago
- Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs"☆28Updated 2 years ago
- My explorations into editing the knowledge and memories of an attention network☆34Updated 2 years ago
- PhD thesis (updating) of Jiatao Gu from HKU☆19Updated 6 years ago
- A library for data streaming and augmentation☆20Updated 11 months ago
- ☆28Updated 2 years ago
- A package for fine tuning of pretrained NLP transformers using Semi Supervised Learning☆15Updated 3 years ago
- Pretraining summarization models using a corpus of nonsense☆13Updated 3 years ago
- GPT, but made only out of MLPs☆88Updated 3 years ago
- The implementation of "Neural Machine Translation without Embeddings", NAACL 2021☆33Updated 3 years ago
- ☆22Updated 3 years ago
- Method to improve inference time for BERT. This is an implementation of the paper titled "PoWER-BERT: Accelerating BERT Inference via Pro…☆59Updated last year
- Video Games Dataset for Multi-Document Summarization☆16Updated 2 weeks ago
- ☆13Updated 2 years ago
- Code for the paper-"Mirostat: A Perplexity-Controlled Neural Text Decoding Algorithm" (https://arxiv.org/abs/2007.14966).☆58Updated 3 years ago
- The official repository for our paper "The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers". We s…☆67Updated 2 years ago
- No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022)☆30Updated 3 years ago
- Source code for the paper "Multilingual Neural Machine Translation with Soft Decoupled Encoding"☆29Updated 3 years ago
- Combining encoder-based language models☆11Updated 3 years ago
- PyTorch Implementation of NeurIPS 2020 paper "Learning Sparse Prototypes for Text Generation"☆22Updated 3 years ago
- c++ mosestokenizer☆17Updated 11 months ago
- DEMix Layers for Modular Language Modeling☆53Updated 3 years ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing☆48Updated 3 years ago
- A collection of Models, Datasets, DataModules, Callbacks, Metrics, Losses and Loggers to better integrate pytorch-lightning with transfor…☆47Updated last year
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"☆56Updated 2 years ago