bojone / rnn
Some RNN implementations
☆51Updated 2 years ago
Alternatives and similar repositories for rnn
Users that are interested in rnn are comparing it to the libraries listed below
- Pytorch implementation of "Block Recurrent Transformers" (Hutchins & Schlag et al., 2022)☆84Updated 3 years ago
- Transformer model based on the Gated Attention Unit (preview version)☆98Updated 2 years ago
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se…☆66Updated last year
- A PyTorch & Keras implementation and demo of Fastformer.☆191Updated 3 years ago
- ☆29Updated last year
- FLASHQuad_pytorch☆68Updated 3 years ago
- This is a code repository for the ACL 2022 paper "ODE Transformer: An Ordinary Differential Equation-Inspired Model for Sequence Generati…☆35Updated 3 years ago
- Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755)☆37Updated 4 years ago
- ICLR2023 - Tailoring Language Generation Models under Total Variation Distance☆21Updated 2 years ago
- A single-model, multi-scale VAE based on the Transformer☆57Updated 4 years ago
- [ICLR 2024] EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling (https://arxiv.org/abs/2310.04691)☆126Updated last year
- A Tight-fisted Optimizer☆50Updated 2 years ago
- Implementation of the Transformer variant proposed in "Transformer Quality in Linear Time"☆371Updated 2 years ago
- A *tuned* minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training☆119Updated 4 years ago
- ☆84Updated 6 years ago
- ☆49Updated 5 months ago
- ☆14Updated 2 years ago
- [EMNLP'19] Summary for Transformer Understanding☆53Updated 6 years ago
- ☆52Updated 2 years ago
- Analytical solutions for logistic regression and single-layer softmax☆12Updated 4 years ago
- [ICLR 2023] Official implementation of Transnormer in our ICLR 2023 paper - Toeplitz Neural Network for Sequence Modeling☆81Updated last year
- Sparse Attention with Linear Units☆19Updated 4 years ago
- [EMNLP 2022] Official implementation of Transnormer in our EMNLP 2022 paper - The Devil in Linear Transformer☆64Updated 2 years ago
- Code for ICML 2020 paper: Do RNN and LSTM have Long Memory?☆17Updated 4 years ago
- ☆33Updated 4 years ago
- Shuffling files of several hundred GB in Python☆33Updated 4 years ago
- ☆67Updated last year
- Contextual Position Encoding but with some custom CUDA Kernels https://arxiv.org/abs/2405.18719☆22Updated last year
- An implementation of the paper "Retentive Network: A Successor to Transformer for Large Language Models" https://arxiv.org/pdf/2307.08621.pdf☆11Updated 2 years ago
- ☆27Updated 6 years ago