[ICLR 2025] Official Code Release for Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation
☆50 · Mar 1, 2025 · Updated last year
Alternatives and similar repositories for UnifiedImplicitAttnRepr
Users who are interested in UnifiedImplicitAttnRepr are comparing it to the libraries listed below.
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models" · ☆232 · Oct 16, 2025 · Updated 7 months ago
- ☆13 · Jul 11, 2025 · Updated 10 months ago
- Code for the paper: https://arxiv.org/pdf/2309.06979.pdf · ☆21 · Jul 29, 2024 · Updated last year
- ☆20 · Dec 24, 2024 · Updated last year
- ☆16 · Jul 10, 2023 · Updated 2 years ago
- ☆11 · Apr 23, 2023 · Updated 3 years ago
- ☆39 · Apr 5, 2024 · Updated 2 years ago
- LAGr: Label Aligned Graphs for Better Systematic Generalization in Semantic Parsing · ☆10 · Jun 1, 2022 · Updated 3 years ago
- An Attention Superoptimizer · ☆22 · Jan 20, 2025 · Updated last year
- A large-scale RWKV v7(World, PRWKV, Hybrid-RWKV) inference. Capable of inference by combining multiple states(Pseudo MoE). Easy to deploy… · ☆49 · Oct 21, 2025 · Updated 7 months ago
- Visualize neural networks using TikZ in Julia · ☆15 · Jan 29, 2025 · Updated last year
- Open-sourcing code associated with the AAAI-25 paper "On the Expressiveness and Length Generalization of Selective State-Space Models on … · ☆16 · Sep 18, 2025 · Updated 8 months ago
- [AAAI24] Learning Invariant Inter-pixel Correlations for Superpixel Generation · ☆14 · Mar 27, 2024 · Updated 2 years ago
- ☆60 · Jul 9, 2024 · Updated last year
- ☆17 · Feb 23, 2025 · Updated last year
- The official code for [ECCV2020] "HALO: Hardware-aware Learning to Optimize" · ☆10 · Mar 22, 2023 · Updated 3 years ago
- Expanding linear RNN state-transition matrix eigenvalues to include negatives improves state-tracking tasks and language modeling without… · ☆21 · Mar 15, 2025 · Updated last year
- ☆16 · May 23, 2025 · Updated last year
- Non official implementation of the Linear Recurrent Unit (LRU, Orvieto et al. 2023) · ☆62 · Sep 3, 2025 · Updated 8 months ago
- The accompanying code for "Exploring the limits of decoder-only models trained on public speech recognition corpora" (Ankit Gupta, George… · ☆20 · Oct 11, 2024 · Updated last year
- A fork of the PEFT library, supporting Robust Adaptation (RoSA) · ☆15 · Aug 16, 2024 · Updated last year
- A PyTorch implementation of SIN. · ☆12 · Oct 20, 2021 · Updated 4 years ago
- [NeurIPS 2025 MechInterp Workshop - Spotlight] Official implementation of the paper "RelP: Faithful and Efficient Circuit Discovery in La… · ☆29 · Nov 3, 2025 · Updated 6 months ago
- Sirius, an efficient correction mechanism, which significantly boosts Contextual Sparsity models on reasoning tasks while maintaining its… · ☆21 · Sep 10, 2024 · Updated last year
- ☆30 · Oct 20, 2021 · Updated 4 years ago
- ☆52 · Jan 28, 2024 · Updated 2 years ago
- ☆24 · Oct 29, 2024 · Updated last year
- Implementation of the paper End-to-end Learning of Deterministic Decision Trees · ☆17 · May 19, 2022 · Updated 4 years ago
- Official codebase for NeurIPS 2022 paper End-to-end Learning to Index and Search in Large Output Spaces · ☆12 · Apr 19, 2023 · Updated 3 years ago
- Source code of ACL 2023 accepted paper "AD-KD: Attribution-Driven Knowledge Distillation for Language Model Compression" · ☆13 · Jun 14, 2023 · Updated 2 years ago
- Repository for the ICML 2021 paper: https://arxiv.org/abs/2103.04886 · ☆13 · Jan 24, 2022 · Updated 4 years ago
- [NeurIPS 2020] "FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training" by Yonggan Fu, Ha… · ☆10 · Feb 13, 2022 · Updated 4 years ago
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025) · ☆32 · Apr 9, 2025 · Updated last year
- Official repository for the paper: "Trees with Attention for Set Prediction Tasks" (ICML21) · ☆10 · Jan 19, 2022 · Updated 4 years ago
- Engineering the state of RNN language models (Mamba, RWKV, etc.) · ☆32 · May 25, 2024 · Updated 2 years ago
- ☆11 · Aug 27, 2024 · Updated last year
- Implementation of MoE Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in Pytorch and Ze… · ☆129 · May 12, 2026 · Updated 2 weeks ago
- Factor Graph Grammars in Python · ☆13 · Jan 17, 2026 · Updated 4 months ago
- Official PyTorch Implementation · ☆17 · Dec 3, 2022 · Updated 3 years ago