paulilioaica / Differential-Transformer
☆16Updated 7 months ago
Alternatives and similar repositories for Differential-Transformer
Users who are interested in Differential-Transformer are comparing it to the repositories listed below
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning☆29Updated last year
- BackTime: Backdoor Attacks on Multivariate Time Series Forecasting☆21Updated last month
- Official Code for ICLR 2024 Paper: Non-negative Contrastive Learning☆45Updated last year
- [NeurIPS '24] Code repo for the paper entitled "Learning Structured Representations with Hyperbolic Embeddings" at NeurIPS 2024☆12Updated 3 months ago
- PyTorch implementation of Pseudo-Riemannian Graph Convolutional Networks (NeurIPS'22)☆16Updated 10 months ago
- "Graph Convolutions Enrich the Self-Attention in Transformers!" NeurIPS 2024☆21Updated last month
- [ICML'24] Official PyTorch Implementation of TimeX++☆25Updated 6 months ago
- Official code for "CoDi: Co-evolving Contrastive Diffusion Models for Mixed-type Tabular Synthesis", ICML 2023☆34Updated last year
- [NeurIPS 2024] Physics-Informed Regularization for Domain-Agnostic Dynamical System Modeling☆22Updated 7 months ago
- Official code for ICLR 2023 paper "ContraNorm: A Contrastive Learning Perspective on Oversmoothing and Beyond"☆35Updated 2 years ago
- Code for the NeurIPS 2022 Table Representation Learning Workshop paper: "Diffusion models for missing value imputation in tabular data"☆49Updated 10 months ago
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"☆53Updated last month
- [NeurIPS 2023, Spotlight] Rank-N-Contrast: Learning Continuous Representations for Regression☆113Updated last year
- [ICLR'24] Official PyTorch Implementation of ContraLSP☆30Updated last year
- ☆33Updated 9 months ago
- CAT-Walk is an inductive method that learns hyperedge representations via a novel higher-order random walk, SetWalk.☆14Updated last year
- Time series explainability via self-supervised model behavior consistency☆48Updated last year
- [SDM24] Official code for "Time-Transformer"☆12Updated this week
- PyTorch implementation of the NCDSSM models presented in the ICML '23 paper "Neural Continuous-Discrete State Space Models for Irregularl…☆25Updated last year
- ☆33Updated 4 months ago
- An official implementation of EHRDiff [TMLR]☆25Updated 10 months ago
- Official Code for the paper: "Composite Feature Selection using Deep Ensembles"☆22Updated 2 years ago
- ☆14Updated 3 years ago
- PyTorch implementation of the Differential-Transformer architecture for sequence modeling, specifically tailored as a decoder-only model …☆64Updated 6 months ago
- [NeurIPS 2024 Spotlight] Code for the paper "Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts"☆51Updated 6 months ago
- StableGNN-Generalizing Graph Neural Networks on Out-Of-Distribution Graphs☆22Updated last year
- Official PyTorch implementation of NeuralWalker☆31Updated 11 months ago
- Official Implementation of Paper "Learning to Jump: Thinning and Thickening Latent Counts for Generative Modeling" (ICML 2023)☆10Updated last year
- MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248☆53Updated 11 months ago
- [ICML 2025] Official implementation of "AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasti…☆17Updated this week