Official code for the NeurIPS 2025 paper "RAT: Bridging RNN Efficiency and Attention Accuracy in Language Modeling" (https://arxiv.org/abs/2507.04416)
☆23Dec 10, 2025Updated 2 months ago
Alternatives and similar repositories for RAT
Users who are interested in RAT are comparing it to the libraries listed below
- Official codebase for "Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions" (Matrenok …☆30Dec 8, 2025Updated 2 months ago
- mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models☆11Jan 19, 2024Updated 2 years ago
- Implementation of Cascaded Head-colliding Attention (ACL'2021)☆11Sep 16, 2021Updated 4 years ago
- ☆21Dec 5, 2022Updated 3 years ago
- ☆13Feb 7, 2023Updated 3 years ago
- ☆16May 14, 2024Updated last year
- ☆15Mar 22, 2023Updated 2 years ago
- Official Project Page for HLA: Higher-order Linear Attention (https://arxiv.org/abs/2510.27258)☆45Jan 6, 2026Updated last month
- The official code of "Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers"☆19Jul 24, 2024Updated last year
- A probabilistic model for contextual word representation. Accepted to ACL2023 Findings.☆25Oct 22, 2023Updated 2 years ago
- Xmixers: A collection of SOTA efficient token/channel mixers☆28Sep 4, 2025Updated 5 months ago
- UNLP 2025 Shared Task on Detecting Social Media Manipulation☆23Aug 4, 2025Updated 6 months ago
- Pytorch implementation of Graph-to-Graph Transformer for Transition-based Dependency Parsing accepted to EMNLP 2020☆22Nov 28, 2022Updated 3 years ago
- ☆22Apr 13, 2018Updated 7 years ago
- Code for "BERTifying the Hidden Markov Model for Multi-Source Weakly Supervised Named Entity Recognition"☆32Jun 20, 2023Updated 2 years ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling☆42Dec 29, 2025Updated 2 months ago
- Official code repository of the paper Learning Associative Inference Using Fast Weight Memory by Schlag et al.☆29Feb 25, 2021Updated 5 years ago
- Code and models for the paper titled "Better Feature Integration for Named Entity Recognition", NAACL 2021.☆30Nov 5, 2021Updated 4 years ago
- [NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation)☆34Aug 6, 2023Updated 2 years ago
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi…☆31Dec 5, 2022Updated 3 years ago
- ☆87Updated this week
- A tool for managing Save games in EscapeTheBackrooms.☆20Updated this week
- A time delay estimation method for event-based time-series data. Time delay estimation is also known as the correction of time offsets an…☆15Dec 3, 2025Updated 2 months ago
- ☆10May 26, 2025Updated 9 months ago
- Truncate datetime objects to the specified level of precision, inspired by PostgreSQL's DATE_TRUNC.☆14Apr 20, 2021Updated 4 years ago
- [ACL'20] Highway Transformer: A Gated Transformer.☆33Dec 5, 2021Updated 4 years ago
- RWKV-X is a Linear Complexity Hybrid Language Model based on the RWKV architecture, integrating Sparse Attention to improve the model's l…☆54Jan 12, 2026Updated last month
- Linear Attention for Efficient Bidirectional Sequence Modeling☆15May 13, 2025Updated 9 months ago
- ☆10Oct 2, 2024Updated last year
- Eternium CSS Framework☆13Dec 26, 2025Updated 2 months ago
- The PyTorch implementation of paper "KERMIT: Knowledge Graph Completion of Enhanced Relation Modeling with Inverse Transformation"☆15Jul 4, 2025Updated 7 months ago
- Repository containing the UI and anonymized participant data for "Do Users Write More Insecure Code with AI Assistants?"☆12Apr 11, 2024Updated last year
- ☆20May 24, 2025Updated 9 months ago
- TOD-Flow: Modeling the Structure of Task-Oriented Dialogues☆13Feb 7, 2024Updated 2 years ago
- ☆20Sep 11, 2025Updated 5 months ago
- Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regressi…☆23Oct 1, 2025Updated 4 months ago
- carbon.now.sh python module☆11Oct 10, 2021Updated 4 years ago
- Statistical discontinuous constituent parsing☆11Feb 15, 2018Updated 8 years ago
- auto-generating summaries of interactive and dynamic geovisualization☆11Dec 9, 2024Updated last year