Smerity/sha-rnn

Single Headed Attention RNN - "Stop thinking with your head"

☆1,180

Alternatives and similar repositories for sha-rnn

Users who are interested in sha-rnn are comparing it to the libraries listed below.

facebookresearch / adaptive-span
Transformer training code for sequential tasks
☆610 · Sep 14, 2021 · Updated 4 years ago
zihangdai / xlnet
XLNet: Generalized Autoregressive Pretraining for Language Understanding
☆6,180 · May 28, 2023 · Updated 3 years ago
asappresearch / sru
Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755)
☆2,107 · Jan 4, 2022 · Updated 4 years ago
harvardnlp / pytorch-struct
Fast, general, and tested differentiable structured prediction in PyTorch
☆1,132 · Apr 20, 2022 · Updated 4 years ago
facebookresearch / XLM
PyTorch original implementation of Cross-lingual Language Model Pretraining.
☆2,926 · Feb 14, 2023 · Updated 3 years ago
allenai / allennlp
An open-source NLP research library, built on PyTorch.
☆11,889 · Nov 22, 2022 · Updated 3 years ago
salesforce / awd-lstm-lm
LSTM and QRNN Language Model Toolkit for PyTorch
☆1,990 · Feb 12, 2022 · Updated 4 years ago
facebookresearch / unlikelihood_training
Neural Text Generation with Unlikelihood Training
☆311 · Aug 31, 2021 · Updated 4 years ago
Smerity / pytorch-lamb
Implementation of https://arxiv.org/abs/1904.00962
☆15 · Aug 30, 2019 · Updated 6 years ago
google-research / electra
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
☆2,368 · Mar 23, 2024 · Updated 2 years ago
LiyuanLucasLiu / RAdam
On the Variance of the Adaptive Learning Rate and Beyond
☆2,547 · Jul 31, 2021 · Updated 4 years ago
yikangshen / Ordered-Neurons
Code for the paper "Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks"
☆580 · Aug 28, 2019 · Updated 6 years ago
namisan / mt-dnn
Multi-Task Deep Neural Networks for Natural Language Understanding
☆2,260 · Mar 7, 2024 · Updated 2 years ago
brightmart / albert_zh
A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS, large-scale Chinese pre-trained ALBERT models
☆3,979 · Nov 21, 2022 · Updated 3 years ago
laiguokun / Funnel-Transformer
☆220 · Jun 8, 2020 · Updated 6 years ago
google-research / text-to-text-transfer-transformer
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
☆6,538 · Jul 8, 2026 · Updated last week
huggingface / naacl_transfer_learning_tutorial
Repository of code for the tutorial on Transfer Learning in NLP held at NAACL 2019 in Minneapolis, MN, USA
☆723 · Oct 16, 2019 · Updated 6 years ago
facebookresearch / fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
☆32,245 · Sep 30, 2025 · Updated 9 months ago
sebastianruder / NLP-progress
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the mo…
☆22,957 · Jul 28, 2024 · Updated last year
salesforce / pytorch-qrnn
PyTorch implementation of the Quasi-Recurrent Neural Network - up to 16 times faster than NVIDIA's cuDNN LSTM
☆1,263 · Feb 12, 2022 · Updated 4 years ago
microsoft / MASS
MASS: Masked Sequence to Sequence Pre-training for Language Generation
☆1,117 · Nov 28, 2022 · Updated 3 years ago
kimiyoung / transformer-xl
☆3,707 · Sep 21, 2022 · Updated 3 years ago
openai / sparse_attention
Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"
☆1,615 · Aug 12, 2020 · Updated 5 years ago
thunlp / ERNIE
Source code and dataset for ACL 2019 paper "ERNIE: Enhanced Language Representation with Informative Entities"
☆1,419 · Jan 10, 2024 · Updated 2 years ago
huggingface / hmtl
🌊HMTL: Hierarchical Multi-Task Learning - A State-of-the-Art neural network model for several NLP tasks based on PyTorch and AllenNLP
☆1,195 · Aug 1, 2023 · Updated 2 years ago
salesforce / ctrl
Conditional Transformer Language Model for Controllable Generation
☆1,881 · May 1, 2025 · Updated last year
thunlp / PLMpapers
Must-read Papers on pre-trained language models.
☆3,361 · Nov 6, 2022 · Updated 3 years ago
asyml / texar-pytorch
Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CAS…
☆747 · Apr 14, 2022 · Updated 4 years ago
locuslab / trellisnet
[ICLR'19] Trellis Networks for Sequence Modeling
☆471 · Aug 20, 2019 · Updated 6 years ago
idiap / fast-transformers
Pytorch library for fast transformer implementations
☆1,773 · Mar 23, 2023 · Updated 3 years ago
graykode / xlnet-Pytorch
Simple XLNet implementation with Pytorch Wrapper
☆581 · Jul 3, 2019 · Updated 7 years ago
huawei-noah / Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
☆3,162 · Jan 22, 2024 · Updated 2 years ago
Luolc / AdaBound
An optimizer that trains as fast as Adam and as good as SGD.
☆2,904 · Jul 23, 2023 · Updated 2 years ago
facebookresearch / pytext
A natural language modeling framework based on PyTorch
☆6,296 · Oct 17, 2022 · Updated 3 years ago
PetrochukM / PyTorch-NLP
Basic Utilities for PyTorch Natural Language Processing (NLP)
☆2,224 · Jul 4, 2023 · Updated 3 years ago
lucidrains / reformer-pytorch
Reformer, the efficient Transformer, in Pytorch
☆2,191 · Jun 21, 2023 · Updated 3 years ago
huggingface / pytorch_block_sparse
Fast Block Sparse Matrices for Pytorch
☆551 · Jan 21, 2021 · Updated 5 years ago
uber-research / PPLM
Plug and Play Language Model implementation. Allows to steer topic and attributes of GPT-2 models.
☆1,153 · Feb 20, 2024 · Updated 2 years ago
marcotcr / checklist
Beyond Accuracy: Behavioral Testing of NLP models with CheckList
☆2,051 · Jan 9, 2024 · Updated 2 years ago