paulilioaica / Differential-Transformer
☆14Updated 6 months ago
Alternatives and similar repositories for Differential-Transformer:
Users who are interested in Differential-Transformer are comparing it to the libraries listed below
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning☆28Updated last year
- PyTorch implementation of Pseudo-Riemannian Graph Convolutional Networks (NeurIPS'22)☆16Updated 9 months ago
- [ICLR'24] Official PyTorch Implementation of ContraLSP☆29Updated last year
- ☆33Updated 9 months ago
- PyTorch implementation of "Kernel Neural Optimal Transport" (ICLR 2023)☆25Updated last year
- Official Code for ICLR 2024 Paper: Non-negative Contrastive Learning☆43Updated last year
- Self Supervised Learning for Time Series Using Similarity Distillation☆10Updated 2 years ago
- Code for the paper "Disentangled Generative Models for Robust Prediction of System Dynamics"☆14Updated last year
- PyTorch implementation of the NCDSSM models presented in the ICML '23 paper "Neural Continuous-Discrete State Space Models for Irregularl…☆25Updated last year
- Implementation of Implicit Graphon Neural Representation☆12Updated last year
- Code for "Score-based Generative Modeling Secretly Minimizes the Wasserstein Distance", NeurIPS 2022.☆17Updated 2 years ago
- C-Mixup for NeurIPS 2022☆70Updated last year
- "Graph Convolutions Enrich the Self-Attention in Transformers!" NeurIPS 2024☆20Updated last month
- Entropic Optimal Transport Benchmark (NeurIPS 2023).☆23Updated last year
- Official source code for Time is Not Enough: Time-Frequency based Explanation for Time-Series Black-Box Models☆10Updated 4 months ago
- Code for GFlowNet-EM, a novel algorithm for fitting latent variable models with compositional latents and an intractable true posterior.☆40Updated last year
- This repository holds the code for the paper "Deep Conditional Gaussian Mixture Model for Constrained Clustering".☆33Updated 3 years ago
- [ICML'24] Official PyTorch Implementation of TimeX++☆23Updated 5 months ago
- Graph Transformers for Large Graphs☆21Updated 11 months ago
- ☆37Updated 8 months ago
- Official implementation of ICLR 2024 paper "Contrastive Learning Is Spectral Clustering On Similarity Graph" (https://arxiv.org/abs/2303.…☆18Updated 7 months ago
- MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248☆51Updated 10 months ago
- Spectral Graph Attention Network with Fast Eigen-approximation☆11Updated 3 years ago
- [NeurIPS 2024] The repository for experiment codes for the paper: Scaling Law for Time Series Forecasting.☆17Updated 6 months ago
- Bayesian Attention Modules☆35Updated 4 years ago
- Integrated gradients attribution method implemented in PyTorch☆26Updated 4 years ago
- [NeurIPS 2024] Physics-Informed Regularization for Domain-Agnostic Dynamical System Modeling☆22Updated 6 months ago
- Implementation of our ICLR2023 paper "Spherical-Sliced Wasserstein"☆13Updated last year
- Official repository for Cell Attention Networks☆14Updated last year
- Official Code for the paper: "Composite Feature Selection using Deep Ensembles"☆22Updated 2 years ago