[ICML 2021 Oral] We show pure attention suffers rank collapse, and how different mechanisms combat it.
☆173 · Mar 8, 2021 · Updated 5 years ago
Alternatives and similar repositories for attention-rank-collapse
Users that are interested in attention-rank-collapse are comparing it to the libraries listed below.
- Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts ☆15 · Feb 26, 2024 · Updated 2 years ago
- ☆13 · Feb 16, 2021 · Updated 5 years ago
- Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Context Terms ☆20 · Nov 29, 2021 · Updated 4 years ago
- Code for the TCS paper "On the performance of learned data structures" and the ICML paper "Why are learned indexes so effective?" ☆21 · May 9, 2021 · Updated 5 years ago
- Code Release for the 2023 NeurIPS Paper How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained langua… ☆17 · Dec 6, 2024 · Updated last year
- Efficient Householder Transformation in PyTorch ☆69 · Jul 6, 2021 · Updated 4 years ago