kyegomez / DifferentialTransformer

An open-source community implementation of the model from the "DIFFERENTIAL TRANSFORMER" paper by Microsoft.

☆37

Alternatives and similar repositories for DifferentialTransformer

Users who are interested in DifferentialTransformer are comparing it to the libraries listed below.


nanowell / Differential-Transformer-PyTorch
PyTorch implementation of the Differential-Transformer architecture for sequence modeling, specifically tailored as a decoder-only model …
☆83 Updated last year
tommyip / mamba2-minimal
Minimal Mamba-2 implementation in PyTorch
☆236 Updated last year
YihongDong / FAN
☆252 Updated last month
akaashdash / kansformers
☆140 Updated last year
kyegomez / SwitchTransformers
Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficien…
☆134 Updated last month
badripatro / simba
Simba
☆215 Updated last year
howard-hou / RWKV-TS
RWKV-TS: Beyond Traditional Recurrent Neural Network for Time Series Tasks
☆120 Updated last year
muditbhargava66 / PyxLSTM
Efficient Python library for Extended LSTM with exponential gating, memory mixing, and matrix memory for superior sequence modeling.
☆302 Updated last year
kyegomez / xLSTM
Implementation of xLSTM in Pytorch from the paper: "xLSTM: Extended Long Short-Term Memory"
☆119 Updated last month
kyegomez / Griffin
Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"
☆56 Updated last month
kyegomez / MambaTransformer
Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling
☆211 Updated last month
cheng-haha / KANs
🕹️The toy examples of Kolmogorov-Arnold Network (Get Started Quickly)
☆75 Updated last year
jwzhanggy / tinyBIG
tinybig for deep function learning
☆61 Updated 6 months ago
XiongxiaoXu / SST
The code of the CIKM'25 paper: "SST: Multi-Scale Hybrid Mamba-Transformer Experts for Time Series Forecasting"
☆200 Updated 3 weeks ago
miniHuiHui / awesome-high-order-neural-network
☆54 Updated last year
Event-AHU / Mamba_State_Space_Model_Paper_List
[Mamba-Survey-2024] Paper list for State-Space-Model/Mamba and its Applications
☆745 Updated 5 months ago
Leopold2333 / Bi-Mamba4TS
☆80 Updated 9 months ago
XiudingCai / Awesome-Mamba-Collection
A curated collection of papers, tutorials, videos, and other valuable resources related to Mamba.
☆673 Updated 3 months ago
pengzhangzhi / Awesome-Mamba
Awesome list of papers that extend Mamba to various applications.
☆139 Updated 5 months ago
radarFudan / Awesome-state-space-models
Collection of papers on state-space models
☆609 Updated last month
lunaaa95 / mou
Semantics-Aware Patch Encoding and Hierarchical Dependency Modeling for Long-Term Time Series Forecasting
☆45 Updated 4 months ago
Atik-Ahamed / TimeMachine
TimeMachine: A Time Series is Worth 4 Mambas for Long-term Forecasting
☆201 Updated last year
myscience / x-lstm
Pytorch implementation of the xLSTM model by Beck et al. (2024)
☆179 Updated last year
smvorwerk / xlstm-cuda
Cuda implementation of Extended Long Short Term Memory (xLSTM) with C++ and PyTorch ports
☆90 Updated last year
yyyujintang / VMRNN-PyTorch
Official repository for CVPR24 Precognition Workshop Paper: VMRNN: Integrating Vision Mamba and LSTM for Efficient and Accurate Spatiotem…
☆154 Updated last year
wzhwzhwzh0921 / S-D-Mamba
Code for "Is Mamba Effective for Time Series Forecasting?"
☆352 Updated 6 months ago
jlamprou / Fourier-Analysis-Networks-FAN
A pytorch implementation of Fourier Analysis Networks (FAN)
☆37 Updated last year
badripatro / mamba360
State Space Models
☆71 Updated last year
hkproj / mamba-notes
Notes on the Mamba and the S4 model (Mamba: Linear-Time Sequence Modeling with Selective State Spaces)
☆175 Updated last year
lgy112112 / ikan
ikan: many kan variants for every body
☆292 Updated 4 months ago