Code for the paper Don't Pay Attention
☆54 · Sep 25, 2025 · Updated 5 months ago
Alternatives and similar repositories for avey-dpa
Users that are interested in avey-dpa are comparing it to the libraries listed below
- AGaLiTe: Approximate Gated Linear Transformers for Online Reinforcement Learning (Published in TMLR) ☆23 · Oct 15, 2024 · Updated last year
- ☆45 · Apr 30, 2018 · Updated 7 years ago
- Combining SOAP and MUON ☆19 · Feb 11, 2025 · Updated last year
- Ultra-minimal autoregressive diffusion model for image generation ☆21 · Dec 26, 2025 · Updated 2 months ago
- Semantic alignment of astronomical data with natural language using multi-modal models. (Jax) Code associated with https://arxiv.org/abs/… ☆17 · Oct 18, 2024 · Updated last year
- Curse-of-memory phenomenon of RNNs in sequence modelling ☆19 · May 8, 2025 · Updated 10 months ago
- Fluid Language Model Benchmarking ☆26 · Sep 16, 2025 · Updated 5 months ago
- ☆20 · Apr 17, 2023 · Updated 2 years ago
- ☆19 · Dec 4, 2025 · Updated 3 months ago
- The accompanying code for "Simplifying and Understanding State Space Models with Diagonal Linear RNNs" (Ankit Gupta, Harsh Mehta, Jonatha… ☆23 · Dec 30, 2022 · Updated 3 years ago
- qwen3 experiments ☆34 · Jul 1, 2025 · Updated 8 months ago
- Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence ☆58 · Nov 11, 2025 · Updated 3 months ago
- AI-free static security scanner for Claude Code artifacts (Skills, Hooks, MCP configs). Detects data exfiltration, prompt injection, and … ☆17 · Updated this week
- Code implementing "Efficient Parallelization of a Ubiquitous Sequential Computation" (Heinsen, 2023) ☆98 · Dec 5, 2024 · Updated last year
- ☆22 · Nov 9, 2024 · Updated last year
- ☆29 · Jul 9, 2024 · Updated last year
- Official Repo for Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics ☆71 · Jan 13, 2026 · Updated last month
- Non official implementation of the Linear Recurrent Unit (LRU, Orvieto et al. 2023) ☆62 · Sep 3, 2025 · Updated 6 months ago
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se… ☆67 · Apr 24, 2024 · Updated last year
- Code for "ReSpace: Text-Driven 3D Indoor Scene Synthesis and Editing with Preference Alignment" ☆61 · Dec 9, 2025 · Updated 3 months ago
- This repo is built to facilitate the training and analysis of autoregressive transformers on maze-solving tasks. ☆35 · Oct 28, 2025 · Updated 4 months ago
- QGIS plugin ☆10 · Jan 16, 2023 · Updated 3 years ago
- Official code for ICLR 2023 paper "ContraNorm: A Contrastive Learning Perspective on Oversmoothing and Beyond" ☆35 · Apr 24, 2023 · Updated 2 years ago
- ☆35 · Apr 12, 2024 · Updated last year
- Declarative SkiaSharp drawings - eg SVG or XAML ☆31 · Sep 19, 2024 · Updated last year
- documentation used in my projects ☆16 · Mar 2, 2026 · Updated last week
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆42 · Dec 29, 2025 · Updated 2 months ago
- manipulating cointegrated pairs to achieve a market-neutral strategy that outperforms indices ☆12 · Jan 12, 2021 · Updated 5 years ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆309 · Dec 6, 2025 · Updated 3 months ago
- JAX/Flax implementation of the Hyena Hierarchy ☆34 · Apr 27, 2023 · Updated 2 years ago
- An efficient implementation of the NSA (Native Sparse Attention) kernel ☆129 · Jun 24, 2025 · Updated 8 months ago
- Example to read qr code with kotlin ☆10 · Jul 24, 2018 · Updated 7 years ago
- This is a frontend to the Inkscape command line feature to allow the user to perform batch conversions of SVG files. ☆15 · Dec 10, 2013 · Updated 12 years ago
- Bugtracker of novel-ebook.com ☆12 · Aug 11, 2021 · Updated 4 years ago
- Card Payments Simulation Tool For Indie Devs: Core Card Switch Engine, Fraud Engine, ATM/POS GUI Simulator, Admin Dash (Real-time MSG … ☆19 · Jun 15, 2025 · Updated 8 months ago
- Horizontal Pod Autoscaling for .NET applications ☆10 · May 23, 2019 · Updated 6 years ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆93 · Jan 25, 2024 · Updated 2 years ago
- A neural network layer API and library for sequence modeling, designed for easy creation of sequence models that can be executed layerwis… ☆56 · Feb 20, 2026 · Updated 2 weeks ago
- ☆35 · Nov 22, 2024 · Updated last year