☆14Jul 26, 2023Updated 2 years ago
Alternatives and similar repositories for retnet
Users that are interested in retnet are comparing it to the libraries listed below.
- An implementation of the paper "Retentive Network: A Successor to Transformer for Large Language Models" https://arxiv.org/pdf/2307.08621.pdf☆11Jul 25, 2023Updated 2 years ago
- Maximal Update Parametrization (μP) with Flax & Optax.☆16Dec 27, 2023Updated 2 years ago
- Huggingface compatible implementation of RetNet (Retentive Networks, https://arxiv.org/pdf/2307.08621.pdf) including parallel, recurrent,…☆226Mar 12, 2024Updated 2 years ago
- A simple but robust PyTorch implementation of RetNet from "Retentive Network: A Successor to Transformer for Large Language Models" (http…☆105Nov 24, 2023Updated 2 years ago
- KDD Cup 2022 Baidu Wind Power Forecast project: Baidu wind power forecasting competition (Paddle Track, 5th place)☆13Jul 29, 2022Updated 3 years ago
- Code for "Theoretical Foundations of Deep Selective State-Space Models" (NeurIPS 2024)☆16Jan 7, 2025Updated last year
- ☆11Aug 15, 2023Updated 2 years ago
- Novel Visual Category Discovery with Dual Ranking Statistics and Mutual Knowledge Distillation. Bingchen Zhao and Kai Han. (NeurIPS 2021)☆12Aug 20, 2023Updated 2 years ago
- ☆22Jul 24, 2023Updated 2 years ago
- A PyTorch implementation of MixNet: Mixed Depthwise Convolutional Kernels☆11Aug 5, 2019Updated 6 years ago
- PyTorch implementation of the NCDSSM models presented in the ICML '23 paper "Neural Continuous-Discrete State Space Models for Irregularl…☆27Jul 9, 2023Updated 2 years ago
- ☆17Apr 10, 2024Updated 2 years ago
- Beyond Known Clusters: Probe New Prototypes for Efficient Generalized Class Discovery☆16Apr 28, 2024Updated 2 years ago
- Defending AI-Based Automatic Modulation Recognition Models Against Adversarial Attacks☆11Jan 11, 2025Updated last year
- Open Sourced ML Research Paper Implementations in Tensorflow☆16Jan 8, 2022Updated 4 years ago
- Find context neurons in Pythia models.☆13Jun 13, 2023Updated 3 years ago
- ☆14Jan 17, 2024Updated 2 years ago
- RWKV6 in native pytorch and triton:)☆11Aug 4, 2024Updated last year
- Implementation of MambaFormer in Pytorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learnin…☆21May 12, 2026Updated last month
- ☆10Feb 21, 2023Updated 3 years ago
- Graph Transformers for Large Graphs☆22Apr 26, 2024Updated 2 years ago
- [CVPR 2024] Targeted Representation Alignment for Open-World Semi-Supervised Learning☆14Sep 23, 2024Updated last year
- ☆19Oct 14, 2024Updated last year
- Lightweight convolutional neural network implementations (SqueezeNet/MobileNet/ShuffleNet/MnasNet)☆12Mar 5, 2026Updated 3 months ago
- ☆14Jan 22, 2025Updated last year
- High-performance tokenized language data-loader for Python C++ extension☆15Jul 22, 2024Updated last year
- ☆13Jan 19, 2024Updated 2 years ago
- [CVPR'24] Solving the Catastrophic Forgetting Problem in Generalized Category Discovery https://arxiv.org/pdf/2501.05272☆16Dec 24, 2024Updated last year
- code for Automatic Modulation Open Set Recognition with diffusion models☆19Jan 4, 2025Updated last year
- Perl implementation of the Naval Research Laboratory text-to-phoneme algorithm, described by Elovitz et al (1976)☆16May 7, 2020Updated 6 years ago
- Official Code of Decoupled Graph Convolution (DGC)☆16Jan 31, 2026Updated 4 months ago
- A Comprehensive Survey of Deep Learning for Multivariate Time Series Forecasting: A Channel Strategy Perspective☆39Jan 19, 2026Updated 4 months ago
- A large-scale RWKV v7(World, PRWKV, Hybrid-RWKV) inference. Capable of inference by combining multiple states(Pseudo MoE). Easy to deploy…☆49Oct 21, 2025Updated 7 months ago
- ☆10Jun 10, 2023Updated 3 years ago
- Fork of HyenaDNA, a long-range genomic foundation model built with Hyena☆10Aug 14, 2023Updated 2 years ago
- Unofficial implementation of the paper "Exploring the Space of Key-Value-Query Models with Intention"☆12May 24, 2023Updated 3 years ago
- ☆34Jan 9, 2024Updated 2 years ago
- ☆10May 1, 2023Updated 3 years ago
- Official Code for the paper: "Composite Feature Selection using Deep Ensembles"☆25Mar 26, 2023Updated 3 years ago