bojone / rnn
Some RNN implementations
☆47Updated last year
Related projects:
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se…☆60Updated 4 months ago
- A Tight-fisted Optimizer☆46Updated last year
- A single-model, multi-scale VAE based on Transformer☆53Updated 3 years ago
- Transformer model based on the Gated Attention Unit (preview version)☆95Updated last year
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"☆48Updated last week
- FLASHQuad_pytorch☆66Updated 2 years ago
- code for Explicit Sparse Transformer☆57Updated last year
- ICLR2023 - Tailoring Language Generation Models under Total Variation Distance☆21Updated last year
- [ICLR 2022] Code for paper "Exploring Extreme Parameter Compression for Pre-trained Language Models" (https://arxiv.org/abs/2205.10036)☆19Updated last year
- ☆24Updated 2 months ago
- Analytical solutions for logistic regression and single-layer softmax☆12Updated 3 years ago
- A PyTorch & Keras implementation and demo of Fastformer.☆184Updated 2 years ago
- Shuffling files of several hundred GB in Python☆33Updated 3 years ago
- Paper notes documenting Transformer upgrades☆17Updated last year
- ☆83Updated 4 years ago
- PyTorch implementation of "Block Recurrent Transformers" (Hutchins & Schlag et al., 2022)☆83Updated 2 years ago
- [ICLR 2024] EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling (https://arxiv.org/abs/2310.04691)☆111Updated 6 months ago
- This is a code repository for the ACL 2022 paper "ODE Transformer: An Ordinary Differential Equation-Inspired Model for Sequence Generati…☆28Updated last year
- Code for the paper "Gaussian Transformer: A Lightweight Approach for Natural Language Inference"☆27Updated 4 years ago
- Learning to Encode Position for Transformer with Continuous Dynamical Model☆59Updated 4 years ago
- PyTorch implementation of Pay Attention to MLPs☆39Updated 3 years ago
- A *tuned* minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training☆106Updated 3 years ago
- [ICLR 2023] Official implementation of Transnormer in our ICLR 2023 paper - Toeplitz Neural Network for Sequence Modeling☆70Updated 4 months ago
- Inference Speed Benchmark for Learning to (Learn at Test Time): RNNs with Expressive Hidden States☆33Updated 2 months ago
- [EMNLP 2022] Official implementation of Transnormer in our EMNLP 2022 paper - The Devil in Linear Transformer☆53Updated last year
- ☆48Updated last year
- ☆32Updated 3 years ago
- ☆42Updated last week
- ☆31Updated 2 years ago
- Unofficial PyTorch implementation of the paper "cosFormer: Rethinking Softmax In Attention".☆43Updated 2 years ago