zzd1992 / FlashWindowAttention
Speeds up the attention computation of Swin Transformer
☆31 · Updated 7 months ago
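For context, window attention (the pattern used in Swin Transformer) splits the token sequence into non-overlapping windows and computes attention only within each window. The sketch below is a minimal NumPy reference of that baseline computation, not the repository's actual kernel; the function name and shapes are illustrative.

```python
import numpy as np

def window_attention(x, window_size):
    """Naive window self-attention: partition the sequence into
    non-overlapping windows and attend only within each window.
    Reference sketch of the computation a fused kernel would speed up."""
    seq_len, dim = x.shape
    assert seq_len % window_size == 0, "sequence must split evenly into windows"
    windows = x.reshape(seq_len // window_size, window_size, dim)
    out = np.empty_like(windows)
    scale = dim ** -0.5
    for i, w in enumerate(windows):                    # one window at a time
        scores = (w @ w.T) * scale                     # (window, window) logits
        scores -= scores.max(axis=-1, keepdims=True)   # numerically stable softmax
        probs = np.exp(scores)
        probs /= probs.sum(axis=-1, keepdims=True)
        out[i] = probs @ w                             # weighted sum of values
    return out.reshape(seq_len, dim)

tokens = np.random.default_rng(0).standard_normal((16, 8))
y = window_attention(tokens, window_size=4)
print(y.shape)  # (16, 8)
```

A fused implementation avoids materializing the per-window score matrices and the Python-level loop, which is where most of the speedup over a naive version like this comes from.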
Alternatives and similar repositories for FlashWindowAttention
Users interested in FlashWindowAttention are comparing it to the libraries listed below.
- Triton implementation of bi-directional (non-causal) linear attention — ☆65 · Updated last week
- A library for calculating the FLOPs in the forward() process based on torch.fx — ☆137 · Updated last month
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores — ☆341 · Updated last year
- When it comes to optimizers, it's always better to be safe than sorry — ☆402 · Updated 4 months ago
- Flash-Muon: An Efficient Implementation of Muon Optimizer — ☆233 · Updated 7 months ago
- An efficient PyTorch implementation of selective scan in one file, works with both CPU and GPU, with corresponding mathematical derivation — ☆101 · Updated 3 months ago
- Implementation of fused cosine similarity attention in the same style as Flash Attention — ☆220 · Updated 2 years ago
- ☆191 · Updated last year
- A simple minimal implementation of Reversible Vision Transformers — ☆127 · Updated last year
- Batch computation of the linear assignment problem on GPU — ☆105 · Updated 4 months ago
- Official PyTorch implementation of "LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging" (ICML 2024) — ☆31 · Updated last year
- Patch convolution to avoid large GPU memory usage of Conv2D — ☆95 · Updated last year
- Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch — ☆135 · Updated 3 months ago
- Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models — ☆340 · Updated 11 months ago
- flex-block-attn: an efficient block sparse attention computation library — ☆108 · Updated last month
- Attempt to make multiple residual streams from Bytedance's Hyper-Connections paper accessible to the public — ☆168 · Updated 3 weeks ago
- [ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models" — ☆68 · Updated last year
- Get down and dirty with FlashAttention2.0 in PyTorch, plug in and play, no complex CUDA kernels — ☆112 · Updated 2 years ago
- A block-oriented training approach for inference-time optimization — ☆34 · Updated last year
- ☆48 · Updated last month
- PyTorch implementation of the sparse attention from the paper "Generating Long Sequences with Sparse Transformers" — ☆94 · Updated last week
- [ICLR 2025] Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule — ☆452 · Updated 4 months ago
- An official code release of the paper "RGB no more: Minimally Decoded JPEG Vision Transformers" — ☆57 · Updated 2 years ago
- Implementation of Linformer for PyTorch — ☆305 · Updated 2 years ago
- ☆160 · Updated 2 years ago
- This repository contains the experimental PyTorch-native float8 training UX — ☆227 · Updated last year
- Fast and memory-efficient exact attention — ☆20 · Updated last year
- ☆292 · Updated last year
- ☆201 · Updated 2 years ago
- ☆307 · Updated 9 months ago