YihongDong / FAN
☆52Updated this week
Related projects
Alternatives and complementary repositories for FAN
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"☆49Updated this week
- PyTorch implementation of the Differential-Transformer architecture for sequence modeling, specifically tailored as a decoder-only model …☆30Updated 2 weeks ago
- RWKV-TS: Beyond Traditional Recurrent Neural Network for Time Series Tasks☆75Updated 2 months ago
- State Space Models☆62Updated 6 months ago
- ☆118Updated 6 months ago
- Implementation of xLSTM in PyTorch from the paper: "xLSTM: Extended Long Short-Term Memory"☆103Updated this week
- A repository for DenseSSMs☆88Updated 7 months ago
- Minimal Mamba-2 implementation in PyTorch☆129Updated 4 months ago
- A Triton Kernel for incorporating Bi-Directionality in Mamba2☆47Updated 2 months ago
- Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficien…☆54Updated this week
- ☆41Updated 7 months ago
- CUDA implementation of Extended Long Short-Term Memory (xLSTM) with C++ and PyTorch ports☆74Updated 5 months ago
- ☆25Updated last month
- ☆38Updated 5 months ago
- My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing o…☆41Updated 11 months ago
- This is the official PyTorch implementation of the paper: Diffusion Auto-regressive Transformer for Effective Self-supervised Time Series…☆26Updated this week
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆31Updated last month
- Simba☆182Updated 7 months ago
- Transformer model based on the Kolmogorov–Arnold Network (KAN), which is an alternative to the Multi-Layer Perceptron (MLP)☆24Updated 3 weeks ago
- PyTorch implementation of the xLSTM model by Beck et al. (2024)☆137Updated 3 months ago
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling☆167Updated last week
- Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers"☆102Updated 3 months ago
- Official repository for CVPR24 Precognition Workshop Paper: VMRNN: Integrating Vision Mamba and LSTM for Efficient and Accurate Spatiotem…☆100Updated 7 months ago
- Community implementation of the paper "Multi-Head Mixture-of-Experts" in PyTorch☆18Updated last week
- First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting…☆66Updated 3 weeks ago
- Source code for Leveraging 2D Information for Long-term Time Series Forecasting with Vanilla Transformers☆13Updated 5 months ago
- Implementation of Agent Attention in PyTorch☆86Updated 4 months ago
- ☆25Updated 4 months ago
- Kolmogorov-Arnold Networks (KAN) using Jacobi polynomials instead of B-splines.☆32Updated 6 months ago
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se…☆61Updated 6 months ago