LUMIA-Group / FourierTransformer
The official PyTorch implementation of the paper "Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator" (ACL 2023 Findings)
☆30 Updated 8 months ago
Related projects
Alternatives and complementary repositories for FourierTransformer
- HGRN2: Gated Linear RNNs with State Expansion ☆49 Updated 3 months ago
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ☆33 Updated last month
- A repository for DenseSSMs ☆88 Updated 7 months ago
- A Triton Kernel for incorporating Bi-Directionality in Mamba2 ☆50 Updated 2 months ago
- Official Code for ICLR 2024 Paper: Non-negative Contrastive Learning ☆37 Updated 7 months ago
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se… ☆61 Updated 7 months ago
- State Space Models ☆63 Updated 6 months ago
- [ICLR 2024] EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling (https://arxiv.org/abs/2310.04691) ☆115 Updated 8 months ago
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆50 Updated 2 weeks ago
- [ICLR 2024] This is the repository for the paper titled "DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning" ☆94 Updated 7 months ago
- Mixture of Attention Heads ☆39 Updated 2 years ago
- On the Effectiveness of Parameter-Efficient Fine-Tuning ☆38 Updated last year
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning ☆29 Updated last year
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal… ☆44 Updated last year
- [ICLR 2022] "Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice" by Peihao Wang, Wen… ☆76 Updated 10 months ago
- [EMNLP 2023 Main] Sparse Low-rank Adaptation of Pre-trained Language Models ☆70 Updated 8 months ago
- Awesome Learn From Model Beyond Fine-Tuning: A Survey ☆50 Updated this week
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns … ☆16 Updated last year
- Curse-of-memory phenomenon of RNNs in sequence modelling ☆19 Updated last week
- [NeurIPS'24 Oral] HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning ☆81 Updated last week
- Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers" ☆103 Updated 3 months ago
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models" ☆202 Updated 5 months ago
- Implementation of Agent Attention in PyTorch ☆86 Updated 4 months ago
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆33 Updated last month
- MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248 ☆34 Updated 5 months ago