kyegomez / DifferentialTransformer
An open-source community implementation of the model from the "Differential Transformer" paper by Microsoft.
☆22Updated this week
Alternatives and similar repositories for DifferentialTransformer:
Users who are interested in DifferentialTransformer are comparing it to the libraries listed below:
- PyTorch implementation of the Differential-Transformer architecture for sequence modeling, specifically tailored as a decoder-only model …☆53Updated 3 months ago
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"☆51Updated 2 weeks ago
- ☆198Updated last week
- Implementation of xLSTM in PyTorch from the paper: "xLSTM: Extended Long Short-Term Memory"☆115Updated 2 weeks ago
- A PyTorch implementation of Fourier Analysis Networks (FAN)☆30Updated 4 months ago
- Transformer model based on the Kolmogorov–Arnold Network (KAN), an alternative to the Multi-Layer Perceptron (MLP)☆27Updated 2 months ago
- ☆126Updated 9 months ago
- ☆40Updated 4 months ago
- CUDA implementation of Extended Long Short-Term Memory (xLSTM) with C++ and PyTorch ports☆85Updated 8 months ago
- [AAAI 2025] Official Implementation of "Auto-Regressive Moving Diffusion Models for Time Series Forecasting"☆47Updated last week
- RWKV-TS: Beyond Traditional Recurrent Neural Network for Time Series Tasks☆91Updated 6 months ago
- A Triton Kernel for incorporating Bi-Directionality in Mamba2☆60Updated last month
- ☆45Updated 8 months ago
- TimeMachine: A Time Series is Worth 4 Mambas for Long-term Forecasting☆169Updated 6 months ago
- Minimal Mamba-2 implementation in PyTorch☆170Updated 7 months ago
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling☆185Updated 2 weeks ago
- An implementation of mLSTM and sLSTM in PyTorch.☆27Updated 8 months ago
- Official repository for CVPR24 Precognition Workshop Paper: VMRNN: Integrating Vision Mamba and LSTM for Efficient and Accurate Spatiotem…☆118Updated 10 months ago
- Simba☆200Updated 10 months ago
- 🕹️ Toy examples of the Kolmogorov-Arnold Network (Get Started Quickly)☆76Updated 9 months ago
- PyTorch implementation of Structured State Space for Sequence Modeling (S4), based on Annotated S4.☆76Updated 11 months ago
- Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficien…☆78Updated 2 weeks ago
- Code for "Is Mamba Effective for Time Series Forecasting?"☆240Updated last month
- My attempts at implementing various bits of Sepp Hochreiter's new xLSTM architecture☆129Updated 9 months ago
- Mamba or Transformer for Time Series Forecasting? Mixture of Universals (MoU) Is All You Need.☆40Updated 5 months ago
- Official PyTorch implementation of "PGN: The RNN’s New Successor is Effective for Long-Range Time Series Forecasting" (NeurIPS 2024).☆69Updated last month
- The official implementation of the paper: "SST: Multi-Scale Hybrid Mamba-Transformer Experts for Long-Short Range Time Series Forecasting…☆146Updated 3 months ago
- PyTorch implementation of Simplified Structured State-Spaces for Sequence Modeling (S5)☆73Updated 9 months ago
- This repository contains the code to replicate the simulations from the paper: "Wav-KAN: Wavelet Kolmogorov-Arnold Networks". It showca…☆123Updated last month
- My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing o…☆44Updated 2 months ago