Official code for the NeurIPS25 paper "RAT: Bridging RNN Efficiency and Attention Accuracy in Language Modeling" (https://arxiv.org/abs/2507.04416)
☆23 · Dec 10, 2025 · Updated 3 months ago
Alternatives and similar repositories for RAT
Users that are interested in RAT are comparing it to the libraries listed below
- mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models ☆11 · Jan 19, 2024 · Updated 2 years ago
- ☆21 · Dec 5, 2022 · Updated 3 years ago
- Codebase to fully reproduce the results of "No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO" (M… ☆31 · Nov 20, 2024 · Updated last year
- ☆13 · Feb 7, 2023 · Updated 3 years ago
- ☆16 · May 14, 2024 · Updated last year
- Implementation of Cascaded Head-colliding Attention (ACL'2021) ☆11 · Sep 16, 2021 · Updated 4 years ago
- Implementation and experiments for Partially Supervised NER via Expected Entity Ratio in TACL 2022 ☆14 · Nov 7, 2022 · Updated 3 years ago
- Official Project Page for HLA: Higher-order Linear Attention (https://arxiv.org/abs/2510.27258) ☆45 · Jan 6, 2026 · Updated 2 months ago
- ☆15 · Mar 22, 2023 · Updated 2 years ago
- Xmixers: A collection of SOTA efficient token/channel mixers ☆28 · Sep 4, 2025 · Updated 6 months ago
- Code for the paper Task Agnostic Morphology Evolution. ☆20 · May 25, 2021 · Updated 4 years ago
- Code and data for the paper "Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?" ☆26 · Jun 3, 2025 · Updated 9 months ago
- Code for "EMS: 3D Eyebrow Modeling from Single-view Images" (SIGGRAPH Asia 2023) ☆13 · May 3, 2025 · Updated 10 months ago
- C++ implementation with Python bindings of analytic forward and inverse kinematics for the Universal Robots. ☆14 · Jan 5, 2026 · Updated 2 months ago
- PyTorch implementation of Graph-to-Graph Transformer for Transition-based Dependency Parsing accepted to EMNLP 2020 ☆22 · Nov 28, 2022 · Updated 3 years ago
- UNLP 2025 Shared Task on Detecting Social Media Manipulation ☆23 · Aug 4, 2025 · Updated 7 months ago
- Ukrainian ELECTRA model ☆12 · Mar 11, 2023 · Updated 3 years ago
- ☆21 · May 25, 2024 · Updated last year
- A probabilistic model for contextual word representation. Accepted to ACL2023 Findings. ☆25 · Oct 22, 2023 · Updated 2 years ago
- ☆110 · Feb 19, 2026 · Updated last month
- Linux distribution for space-grade robotics on the BeagleV-Fire RISC-V platform + FPGA support ☆21 · Dec 24, 2025 · Updated 2 months ago
- FlexiTokens ☆18 · Dec 27, 2025 · Updated 2 months ago
- ☆22 · Apr 13, 2018 · Updated 7 years ago
- The PyTorch implementation of paper "KERMIT: Knowledge Graph Completion of Enhanced Relation Modeling with Inverse Transformation" ☆15 · Jul 4, 2025 · Updated 8 months ago
- Block-Recurrent Dynamics in ViTs 🦖 ☆33 · Dec 24, 2025 · Updated 2 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆42 · Dec 29, 2025 · Updated 2 months ago
- Zero-shot NER fine-tuning ☆14 · Mar 17, 2025 · Updated last year
- ☆31 · Feb 24, 2026 · Updated 3 weeks ago
- Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regressi… ☆23 · Oct 1, 2025 · Updated 5 months ago
- RWKV-X is a Linear Complexity Hybrid Language Model based on the RWKV architecture, integrating Sparse Attention to improve the model's l… ☆56 · Mar 12, 2026 · Updated last week
- [ICLR 2025] SDTT: a simple and effective distillation method for discrete diffusion models ☆47 · Feb 26, 2026 · Updated 3 weeks ago
- ☆14 · Mar 22, 2024 · Updated last year
- Code for "BERTifying the Hidden Markov Model for Multi-Source Weakly Supervised Named Entity Recognition" ☆32 · Jun 20, 2023 · Updated 2 years ago
- Synthetic data generation and benchmark implementation for "Episodic Memories Generation and Evaluation Benchmark for Large Language Mode… ☆64 · Oct 3, 2025 · Updated 5 months ago
- [NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation) ☆34 · Aug 6, 2023 · Updated 2 years ago
- Official code repository of the paper Learning Associative Inference Using Fast Weight Memory by Schlag et al. ☆29 · Feb 25, 2021 · Updated 5 years ago
- HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation [ACL 2023] ☆14 · Jul 11, 2023 · Updated 2 years ago
- Zero-shot entity linking with less data ☆15 · Aug 1, 2022 · Updated 3 years ago
- Code and models for the paper titled "Better Feature Integration for Named Entity Recognition", NAACL 2021. ☆30 · Nov 5, 2021 · Updated 4 years ago