☆14 · Jul 26, 2023 · Updated 2 years ago
Alternatives and similar repositories for retnet
Users who are interested in retnet are comparing it to the libraries listed below
- An implementation of the paper "Retentive Network: A Successor to Transformer for Large Language Models" https://arxiv.org/pdf/2307.08621.pdf ☆11 · Jul 25, 2023 · Updated 2 years ago
- PyTorch implementation of Retentive Network: A Successor to Transformer for Large Language Models ☆14 · Jul 20, 2023 · Updated 2 years ago
- Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Mo… ☆15 · Nov 11, 2024 · Updated last year
- Code for "Theoretical Foundations of Deep Selective State-Space Models" (NeurIPS 2024) ☆15 · Jan 7, 2025 · Updated last year
- Official Code of Decoupled Graph Convolution (DGC) ☆16 · Jan 31, 2026 · Updated last month
- A simple but robust PyTorch implementation of RetNet from "Retentive Network: A Successor to Transformer for Large Language Models" (http… ☆106 · Nov 24, 2023 · Updated 2 years ago
- ☆17 · Aug 1, 2023 · Updated 2 years ago
- Graph Transformers for Large Graphs ☆22 · Apr 26, 2024 · Updated last year
- Huggingface compatible implementation of RetNet (Retentive Networks, https://arxiv.org/pdf/2307.08621.pdf) including parallel, recurrent,… ☆226 · Mar 12, 2024 · Updated last year
- PyTorch implementation of the NCDSSM models presented in the ICML '23 paper "Neural Continuous-Discrete State Space Models for Irregularl… ☆25 · Jul 9, 2023 · Updated 2 years ago
- ☆22 · Jul 24, 2023 · Updated 2 years ago
- Implementation of MambaFormer in PyTorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learnin… ☆21 · Feb 9, 2026 · Updated 3 weeks ago
- Official Code for the paper: "Composite Feature Selection using Deep Ensembles" ☆24 · Mar 26, 2023 · Updated 2 years ago
- Official code for "Reward-Free Curricula for Training Robust World Models", ICLR 2024. ☆28 · Jan 24, 2024 · Updated 2 years ago
- A Comprehensive Survey of Deep Learning for Multivariate Time Series Forecasting: A Channel Strategy Perspective ☆36 · Jan 19, 2026 · Updated last month
- Addressing the problem of predicting crime occurrence based on historic records ☆11 · Nov 27, 2019 · Updated 6 years ago
- Official code for ICLR 2023 paper "ContraNorm: A Contrastive Learning Perspective on Oversmoothing and Beyond" ☆35 · Apr 24, 2023 · Updated 2 years ago
- ☆69 · Aug 3, 2023 · Updated 2 years ago
- Code for: "Neural Controlled Differential Equations for Online Prediction Tasks" ☆41 · Oct 19, 2022 · Updated 3 years ago
- Uncovering Selective State Space Model's Capabilities in Lifelong Sequential Recommendation ☆34 · May 8, 2024 · Updated last year
- This is the official code for WWW 2021 paper "Session-aware Linear Item-Item Models for Session-based Recommendation" ☆33 · Sep 19, 2023 · Updated 2 years ago
- ☆13 · Jun 18, 2025 · Updated 8 months ago
- Official code for ICLR 2022 paper: "PoNet: Pooling Network for Efficient Token Mixing in Long Sequences". ☆33 · May 23, 2023 · Updated 2 years ago
- Official repository for the paper "Partition and Code: learning how to compress graphs" (NeurIPS'21) https://arxiv.org/abs/2107.01952 ☆36 · Oct 26, 2021 · Updated 4 years ago
- RWKV-X is a Linear Complexity Hybrid Language Model based on the RWKV architecture, integrating Sparse Attention to improve the model's l… ☆54 · Jan 12, 2026 · Updated last month
- Official source code of airsep ☆39 · Mar 26, 2024 · Updated last year
- Official code for "STaSy: Score-based Tabular data Synthesis", ICLR 2023 ☆34 · Aug 11, 2023 · Updated 2 years ago
- Official code implementation for ICDE 23 paper MAMDR: A Model Agnostic Learning Method for Multi-Domain Recommendation ☆38 · Nov 27, 2023 · Updated 2 years ago
- The official PyTorch implementation of "An Attentional Multi-scale Co-evolving Model for Dynamic Link Prediction" (TheWebConf'23) ☆11 · May 4, 2023 · Updated 2 years ago
- Mitigating the Filter Bubble while Maintaining Relevance: Targeted Diversification with VAE-based Recommender Systems ☆10 · Mar 15, 2023 · Updated 2 years ago
- Official code for AL-PINNS: Augmented Lagrangian relaxation method for Physics-Informed Neural Networks ☆12 · Jul 29, 2023 · Updated 2 years ago
- This repository reproduces the results in the paper "How expressive are transformers in spectral domain for graphs?" (published in TMLR) ☆12 · Jul 10, 2022 · Updated 3 years ago
- Code for AAAI21 paper "Scalable and Explainable 1-Bit Matrix Completion via Graph Signal Learning" ☆11 · Feb 15, 2022 · Updated 4 years ago
- Use ChatGPT for free with no registration; follow the WeChat official account 【胖竹同学】. ☆10 · Apr 4, 2023 · Updated 2 years ago
- TransientViT: A novel CNN - Vision Transformer hybrid real/bogus transient classifier for the Kilodegree Automatic Transient Survey ☆10 · Nov 7, 2024 · Updated last year
- Graphical intuition to MOSFET square-law ☆11 · Jan 5, 2021 · Updated 5 years ago
- HyFormer: Hybrid Transformer and CNN For Pixel-level Multispectral Image Classification ☆15 · Feb 15, 2023 · Updated 3 years ago
- ☆11 · Nov 27, 2020 · Updated 5 years ago
- ☆11 · Jan 7, 2025 · Updated last year