kyegomez / DifferentialTransformer
An open-source community implementation of the model from the "Differential Transformer" paper by Microsoft.
☆24 · Updated this week
Alternatives and similar repositories for DifferentialTransformer:
Users who are interested in DifferentialTransformer are comparing it to the libraries listed below:
- PyTorch implementation of the Differential-Transformer architecture for sequence modeling, specifically tailored as a decoder-only model … ☆58 · Updated 5 months ago
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆52 · Updated 3 weeks ago
- Implementation of xLSTM in PyTorch from the paper: "xLSTM: Extended Long Short-Term Memory" ☆119 · Updated 3 weeks ago
- ☆230 · Updated last month
- ☆128 · Updated 11 months ago
- Simba ☆206 · Updated last year
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling ☆192 · Updated 2 weeks ago
- CUDA implementation of Extended Long Short Term Memory (xLSTM) with C++ and PyTorch ports ☆86 · Updated 10 months ago
- Mamba or Transformer for Time Series Forecasting? Mixture of Universals (MoU) Is All You Need. ☆43 · Updated 2 months ago
- Convolutional layer for Kolmogorov-Arnold Network (KAN) ☆98 · Updated last month
- Minimal Mamba-2 implementation in PyTorch ☆188 · Updated 10 months ago
- This repository contains the code to replicate the simulations from the paper: "Wav-KAN: Wavelet Kolmogorov-Arnold Networks". It showca… ☆142 · Updated 2 months ago
- The official PyTorch implementation of the paper "Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT … ☆37 · Updated last year
- State Space Models ☆69 · Updated 11 months ago
- ☆48 · Updated last year
- A PyTorch implementation of Fourier Analysis Networks (FAN) ☆34 · Updated 6 months ago
- Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficien… ☆99 · Updated 3 weeks ago
- A Triton Kernel for incorporating Bi-Directionality in Mamba2 ☆65 · Updated 4 months ago
- ☆48 · Updated last month
- ☆50 · Updated 10 months ago
- An implementation of mLSTM and sLSTM in PyTorch. ☆26 · Updated 10 months ago
- Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis ☆229 · Updated 2 months ago
- tinybig for deep function learning ☆60 · Updated 4 months ago
- This is the official code of our TMLR 2025 Paper "DeformTime: Capturing Variable Dependencies with Deformable Attention for Time Series F… ☆26 · Updated 3 weeks ago
- Kolmogorov–Arnold Networks with modified activation (using MLP to represent the activation) ☆103 · Updated 5 months ago
- ☆57 · Updated 2 months ago
- Official repository for CVPR24 Precognition Workshop Paper: VMRNN: Integrating Vision Mamba and LSTM for Efficient and Accurate Spatiotem… ☆126 · Updated last year
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models" ☆218 · Updated 10 months ago
- [AAAI 2025] Official Implementation of "Auto-Regressive Moving Diffusion Models for Time Series Forecasting" ☆72 · Updated 2 months ago
- RWKV-TS: Beyond Traditional Recurrent Neural Network for Time Series Tasks ☆100 · Updated 8 months ago