deep-spin/adasplash

AdaSplash: Adaptive Sparse Flash Attention (aka Flash Entmax Attention)

☆46

Alternatives and similar repositories for adasplash

Users who are interested in adasplash are comparing it to the libraries listed below.

goombalab / Gather-and-Aggregate
Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"
☆16 · Apr 30, 2025 · Updated last year
epfml / pam
☆16 · Dec 9, 2023 · Updated 2 years ago
Ackesnal / RePaViT
This is the official code for paper [RePaViT: Scalable Vision Transformer Acceleration via Structural Reparameterization on Feedforward N…
☆18 · Jun 20, 2025 · Updated last year
Dao-AILab / grouped-latent-attention
☆135 · May 29, 2025 · Updated last year
fasa-org / dash-attention
DashAttention: Differentiable and Adaptive Sparse Hierarchical Attention
☆21 · May 25, 2026 · Updated last month
wdlctc / delta-attention-residuals-code
Delta Attention Residuals - supplementary code and pretrained models
☆40 · May 20, 2026 · Updated 2 months ago
maximzubkov / fft-scan
Efficient PScan implementation in PyTorch
☆17 · Jan 2, 2024 · Updated 2 years ago
catswe / flash-attention-residuals
Triton kernels and PyTorch ops for Block Attention Residuals (AttnRes)
☆86 · May 29, 2026 · Updated last month
mdy666 / Scalable-Flash-Native-Sparse-Attention
☆48 · Dec 13, 2025 · Updated 7 months ago
SakanaAI / fast-weight-product-key-memory
Code for Fast-weight Product Key Memory (FwPKM)
☆19 · Mar 18, 2026 · Updated 4 months ago
lemyx / tilelang-dsa
DeepSeek-V3.2-Exp DSA Warmup Lightning Indexer training operator based on tilelang
☆47 · Nov 19, 2025 · Updated 8 months ago
automl / is_mamba_capable_of_icl
☆18 · Apr 24, 2024 · Updated 2 years ago
sumitramalagi / Unseen-classes-at-a-later-time
☆13 · Apr 7, 2022 · Updated 4 years ago
Infini-AI-Lab / Sparrow
☆16 · Jun 15, 2026 · Updated last month
sjelassi / transformers_ssm_copy
☆40 · Feb 26, 2024 · Updated 2 years ago
Yifei-Zuo / Parallax
Official repository for Parallax (Parameterized Local Linear Attention)
☆65 · Jul 7, 2026 · Updated 2 weeks ago
Zehong-Wang / GPM
Beyond Message Passing: Neural Graph Pattern Machine, ICML 2025
☆15 · May 28, 2025 · Updated last year
lucidrains / simplicial-attention
Implementation of 2-simplicial attention proposed by Clift et al. (2019) and the recent attempt to make practical in Fast and Simplex, Ro…
☆49 · Sep 2, 2025 · Updated 10 months ago
xinghaow99 / pbs-attn
[ICML 2026] Sparser Block-Sparse Attention via Token Permutation
☆31 · May 22, 2026 · Updated last month
fla-org / fla-zoo
Flash-Linear-Attention models beyond language
☆21 · Aug 28, 2025 · Updated 10 months ago
LIONS-EPFL / scion
☆70 · Apr 8, 2026 · Updated 3 months ago
OliverSieberling / dynamic-conv1d
Triton kernels for dynamic causal short convolutions.
☆24 · Jun 4, 2026 · Updated last month
zhenyi4 / ssa
Official repository for "SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space"
☆27 · May 7, 2026 · Updated 2 months ago
IBM / selective-dense-state-space-model
Open-sourcing code associated with the AAAI-25 paper "On the Expressiveness and Length Generalization of Selective State-Space Models on …
☆16 · Sep 18, 2025 · Updated 10 months ago
lasr-spelling / sae-spelling
Code for the paper "A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders"
☆15 · Dec 28, 2025 · Updated 6 months ago
smonsays / hypernetwork-attention
Official code for the paper "Attention as a Hypernetwork"
☆58 · Feb 24, 2026 · Updated 4 months ago
dhcode-cpp / Engram-pytorch
PyTorch implementation of DeepSeek Engram
☆19 · Mar 24, 2026 · Updated 3 months ago
Dao-AILab / gemm-cublas
☆22 · May 5, 2025 · Updated last year
CLAIRE-Labo / RAT
Official code for the NeurIPS25 paper "RAT: Bridging RNN Efficiency and Attention Accuracy in Language Modeling" (https://arxiv.org/abs/25…
☆26 · Dec 10, 2025 · Updated 7 months ago
shiningsunnyday / induction
Foundation Molecular Grammar: Multi-Modal Foundation Models Induce Interpretable Molecular Graph Languages (ICML 2025) & Directed Graph G…
☆21 · Nov 11, 2025 · Updated 8 months ago
kazuki-irie / kv-memory-brain
Official Code Repository for the paper "Key-value memory in the brain"
☆32 · Feb 25, 2025 · Updated last year
ElementAI / lagr
LAGr: Label Aligned Graphs for Better Systematic Generalization in Semantic Parsing
☆10 · Jun 1, 2022 · Updated 4 years ago
hugorichard / ShICA
☆13 · Feb 26, 2025 · Updated last year
ethansmith2000 / TransformerExperiments
☆19 · Dec 4, 2025 · Updated 7 months ago
proger / nanokitchen
Parallel Associative Scan for Language Models
☆18 · Jan 8, 2024 · Updated 2 years ago
thinkwee / DDR_Bench
Deep Data Research. Seek More, See Beyond.
☆16 · Feb 6, 2026 · Updated 5 months ago
norxornor / modded-nanogpt-jax
NanoGPT speedrun in JAX. Originally at https://nor-git.pages.dev/modded-nanogpt-jax/
☆17 · Aug 28, 2025 · Updated 10 months ago
Infini-AI-Lab / gsm_infinite
☆65 · Jun 12, 2025 · Updated last year
hanqi-qi / LLM_MetaReasoning
☆15 · Jul 29, 2025 · Updated 11 months ago