RobinWu218 / ToST
[ICLR 2025 Spotlight] Official Implementation for ToST (Token Statistics Transformer)
☆130 Updated 11 months ago
Alternatives and similar repositories for ToST
Users who are interested in ToST are comparing it to the repositories listed below.
- [ICML 2025] Fourier Position Embedding: Enhancing Attention’s Periodic Extension for Length Generalization ☆108 Updated 8 months ago
- [CVPR2025] Breaking the Low-Rank Dilemma of Linear Attention ☆39 Updated 10 months ago
- Awesome list of papers that extend Mamba to various applications. ☆138 Updated 7 months ago
- [ICLR 2025 & COLM 2025] Official PyTorch implementation of the Forgetting Transformer and Adaptive Computation Pruning ☆137 Updated last month
- [NeurIPS 2025 Spotlight] TPA: Tensor ProducT ATTenTion Transformer (T6) (https://arxiv.org/abs/2501.06425) ☆445 Updated last week
- A repository for DenseSSMs ☆88 Updated last year
- A More Fair and Comprehensive Comparison between KAN and MLP ☆178 Updated last year
- The official implementation for [NeurIPS2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink…