Source code for the paper "Positional Attention: Expressivity and Learnability of Algorithmic Computation"
☆14May 26, 2025Updated 10 months ago
Alternatives and similar repositories for positional_attention
Users that are interested in positional_attention are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Combining SOAP and MUON☆19Feb 11, 2025Updated last year
- Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"☆15Apr 30, 2025Updated 10 months ago
- [KDD 2023] code for "Test accuracy vs. generalization gap: model selection in NLP without accessing training or testing data" https://arx…☆12Oct 17, 2022Updated 3 years ago
- ☆18Nov 10, 2024Updated last year
- [ACL 2025] Official implementation of the "CoT-ICL Lab" framework☆11Oct 10, 2025Updated 5 months ago
- Proton VPN Special Offer - Get 70% off • AdSpecial partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
- Autonomous AI-powered security scanning platform — CLI scanner, web dashboard, and one-command Docker deployment☆32Updated this week
- The official repository for AdaMuon☆35Aug 27, 2025Updated 6 months ago
- Official repository for the paper "Automating Continual Learning"☆18Jun 11, 2025Updated 9 months ago
- RS-IMLE☆44Dec 7, 2024Updated last year
- slowly building a set of infinite riddle generators for data-hungry methods☆14Nov 15, 2022Updated 3 years ago
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs.☆41May 24, 2024Updated last year
- ☆27Jul 9, 2024Updated last year
- Language Segment-Anything (with updated dependencies)☆34Mar 4, 2024Updated 2 years ago
- Sequence algorithms for use in Flashlight.☆14Jan 12, 2026Updated 2 months ago
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- Implementation of Neurips 2023 Paper "Multi Time Scale World Models"☆16Nov 8, 2024Updated last year
- logWMSE, an audio quality metric with support for digital silence target. Useful for evaluating audio source separation systems, even whe…☆37Jun 24, 2025Updated 9 months ago
- Reasoning-based Evaluation and Ranking of Translations.☆20Jul 18, 2025Updated 8 months ago
- manipulating cointegrated pairs to achieve a market-neutral strategy that outperforms indices☆12Jan 12, 2021Updated 5 years ago
- [NeurIPS 2021] code for "Taxonomizing local versus global structure in neural network loss landscapes" https://arxiv.org/abs/2107.11228☆20Jan 7, 2022Updated 4 years ago
- Reversible programming in Agda☆13Jun 22, 2023Updated 2 years ago
- [NeurIPS 2024] AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models☆33Jun 9, 2025Updated 9 months ago
- Tidy autoregressive inference in JAX☆15Sep 1, 2025Updated 6 months ago
- Official source code for Time is Not Enough: Time-Frequency based Explanation for Time-Series Black-Box Models☆12Dec 5, 2024Updated last year
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- [ICML 2025]"Graph World Model", Tao Feng, Yexin Wu, Guanyu Lin, Jiaxuan You☆33Sep 20, 2025Updated 6 months ago
- ☆11Mar 4, 2020Updated 6 years ago
- This is the official code for the paper "Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable".☆29Mar 11, 2025Updated last year
- A Lightweight Deep Learning Library in MATLAB☆11Jun 28, 2019Updated 6 years ago
- coloring terminal text with intensities (used for plotting probability, entropy with tokens)☆12Oct 11, 2024Updated last year
- Repository for Sparse Universal Transformers☆20Oct 23, 2023Updated 2 years ago
- A few models converted from caffe to CoreMLs format.☆15Jun 6, 2017Updated 8 years ago
- ☆12Sep 18, 2024Updated last year
- ☆14Jun 6, 2023Updated 2 years ago
- Proton VPN Special Offer - Get 70% off • AdSpecial partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
- Scratchpad/Chain-of-Thought Prompts☆12Jun 6, 2022Updated 3 years ago
- A script to transcribe audio files with Google Cloud Speech API.☆10Oct 31, 2017Updated 8 years ago
- orbital MCMC☆10Jun 17, 2021Updated 4 years ago
- Implementation of NIPS2023: Unleashing the Full Potential of Product Quantization for Large-Scale Image Retrieva☆11Nov 12, 2024Updated last year
- Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models"☆13Jul 18, 2024Updated last year
- Optimal distance lower bound k-mer sampling.☆12Jun 19, 2024Updated last year
- Open source code for ICML 2025 Paper: Eigenspectrum Analysis of Neural Networks without Aspect Ratio Bias☆43Nov 14, 2025Updated 4 months ago