CUDA implementation of autoregressive linear attention, with all the latest research findings
☆46 · May 23, 2023 · Updated 3 years ago
Alternatives and similar repositories for autoregressive-linear-attention-cuda
Users that are interested in autoregressive-linear-attention-cuda are comparing it to the libraries listed below.
- Implementation of an Attention layer where each head can attend to more than just one token, using coordinate descent to pick topk ☆47 · Jul 16, 2023 · Updated 2 years ago
- Explorations into the recently proposed Taylor Series Linear Attention ☆100 · Aug 18, 2024 · Updated last year
- Implementation of GateLoop Transformer in Pytorch and Jax ☆92 · Jun 18, 2024 · Updated last year
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts ☆122 · Oct 17, 2024 · Updated last year
- Implementation of Zorro, Masked Multimodal Transformer, in Pytorch ☆98 · Oct 20, 2023 · Updated 2 years ago
- Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Stron… ☆17 · Nov 11, 2024 · Updated last year
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch ☆92 · Dec 22, 2023 · Updated 2 years ago
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆53 · Oct 22, 2023 · Updated 2 years ago
- Implementation of a simple BPE tokenizer, but in Nim ☆22 · Jul 2, 2023 · Updated 2 years ago
- FlexAttention w/ FlashAttention3 Support ☆27 · Oct 5, 2024 · Updated last year
- Implementation of a multimodal diffusion transformer in Pytorch ☆108 · Jun 24, 2024 · Updated last year
- A simple implementation of a deep linear Pytorch module ☆21 · Oct 16, 2020 · Updated 5 years ago
- Multi-Modal Multi-Embodied Hivemind-like Iteration of RTX-2 ☆15 · Jun 27, 2025 · Updated 11 months ago
- My own attempt at a long context genomics model, leveraging recent advances in long context attention modeling (Flash Attention + other h… ☆54 · Jul 2, 2023 · Updated 2 years ago
- Standalone Product Key Memory module in Pytorch - for augmenting Transformer models ☆87 · Nov 1, 2025 · Updated 7 months ago
- Implementation of the transformer proposed in "Building Blocks for a Complex-Valued Transformer Architecture" ☆92 · Oct 13, 2023 · Updated 2 years ago
- Implementation of Recurrent Interface Network (RIN), for highly efficient generation of images and video without cascading networks, in P… ☆208 · Feb 14, 2024 · Updated 2 years ago
- Implementation of the Llama architecture with RLHF + Q-learning ☆170 · Feb 1, 2025 · Updated last year
- Implementation of fused cosine similarity attention in the same style as Flash Attention ☆220 · Feb 13, 2023 · Updated 3 years ago
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training" ☆28 · Updated this week
- ☆24 · Jun 18, 2024 · Updated last year
- Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch ☆81 · Dec 4, 2022 · Updated 3 years ago
- Implementation of MambaFormer in Pytorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learnin… ☆21 · May 12, 2026 · Updated last month
- Utilities for PyTorch distributed ☆25 · Feb 27, 2025 · Updated last year
- RWKV-7 mini ☆12 · Mar 29, 2025 · Updated last year
- Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new… ☆126 · Jul 26, 2024 · Updated last year
- Generate python ctypes classes from C headers. Requires LLVM clang ☆15 · Aug 14, 2024 · Updated last year
- Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction" ☆59 · Oct 22, 2023 · Updated 2 years ago
- Simple python library for generating your own perfetto traces for your application. Can be used for both app instrumentation and custom … ☆26 · Jun 22, 2025 · Updated 11 months ago
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights ☆19 · Oct 9, 2022 · Updated 3 years ago
- A vast array of Multi-Modal Embodied Robotic Foundation Models! ☆28 · Mar 18, 2024 · Updated 2 years ago
- https://hf.co/hexgrad/Kokoro-82M ☆14 · Jan 14, 2026 · Updated 5 months ago
- ☆13 · Jun 3, 2024 · Updated 2 years ago
- Hacks for PyTorch ☆19 · Apr 18, 2023 · Updated 3 years ago
- Explorations into some recent techniques surrounding speculative decoding ☆305 · Dec 22, 2024 · Updated last year
- ☆21 · Mar 3, 2025 · Updated last year
- ☆28 · Aug 10, 2023 · Updated 2 years ago
- ☆34 · Sep 10, 2024 · Updated last year
- ☆15 · Nov 24, 2025 · Updated 6 months ago