bojone / rnn
Some RNN implementations
☆50 Updated 2 years ago
Alternatives and similar repositories for rnn
Users who are interested in rnn are comparing it to the libraries listed below.
- Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755) ☆33 Updated 3 years ago
- ☆27 Updated 11 months ago
- Pytorch implementation of "Block Recurrent Transformers" (Hutchins & Schlag et al., 2022) ☆84 Updated 3 years ago
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se… ☆65 Updated last year
- ICLR2023 - Tailoring Language Generation Models under Total Variation Distance ☆21 Updated 2 years ago
- A Tight-fisted Optimizer ☆48 Updated 2 years ago
- FLASHQuad_pytorch ☆67 Updated 3 years ago
- Transformer model based on the Gated Attention Unit (early-access version) ☆98 Updated 2 years ago
- A single-model, multi-scale VAE based on Transformer ☆56 Updated 3 years ago
- [EMNLP 2022] Official implementation of Transnormer in our EMNLP 2022 paper - The Devil in Linear Transformer ☆60 Updated last year
- Implement the paper "Self-Attention with Relative Position Representations" ☆133 Updated 4 years ago
- code for Explicit Sparse Transformer ☆62 Updated last year
- A PyTorch & Keras implementation and demo of Fastformer. ☆189 Updated 2 years ago
- [EMNLP'19] Summary for Transformer Understanding ☆53 Updated 5 years ago
- A Tight-fisted Optimizer (Tiger), implemented in PyTorch. ☆12 Updated last year
- ☆51 Updated 2 years ago
- Analytical solutions for logistic regression and single-layer softmax ☆12 Updated 3 years ago
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆55 Updated 2 months ago
- Shuffling files of several hundred GB in Python ☆33 Updated 3 years ago
- ☆64 Updated 9 months ago
- ☆33 Updated 4 years ago
- ☆83 Updated 5 years ago
- Relative Positional Encoding for Transformers with Linear Complexity ☆64 Updated 3 years ago
- FairSeq repo with Apollo optimizer ☆114 Updated last year
- For the paper "Gaussian Transformer: A Lightweight Approach for Natural Language Inference" ☆28 Updated 5 years ago
- Lion and Adam optimization comparison ☆61 Updated 2 years ago
- ☆23 Updated 2 years ago
- ☆35 Updated 3 years ago
- Implementation of the paper "Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting", https://arxi… ☆19 Updated 3 years ago
- Source code for our AAAI'22 paper "From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression" ☆24 Updated 3 years ago