zzd1992 / FlashWindowAttention
Speed up the attention computation of Swin Transformer
☆17 · Updated 2 weeks ago
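For context, window attention restricts self-attention to non-overlapping local windows of the feature map; that is the baseline operation this repository accelerates. Below is a minimal, non-fused PyTorch sketch of it. The shapes and helper names are illustrative and not taken from this repository:

```python
import torch
import torch.nn.functional as F

def window_attention(x, window_size, num_heads):
    """Naive Swin-style window attention: partition the feature map into
    non-overlapping windows and run self-attention inside each window."""
    B, H, W, C = x.shape
    ws, head_dim = window_size, C // num_heads
    # Partition (B, H, W, C) into (B * num_windows, ws*ws, C) windows.
    x = x.view(B, H // ws, ws, W // ws, ws, C)
    windows = x.permute(0, 1, 3, 2, 4, 5).reshape(-1, ws * ws, C)
    # A real block projects q, k and v with separate weights; reusing one
    # tensor here just keeps the sketch short.
    q = k = v = windows.view(-1, ws * ws, num_heads, head_dim).transpose(1, 2)
    out = F.scaled_dot_product_attention(q, k, v)  # fused kernel where available
    return out.transpose(1, 2).reshape(-1, ws * ws, C)

x = torch.randn(2, 56, 56, 96)  # e.g. a Swin-T stage-1 feature map
print(window_attention(x, window_size=7, num_heads=3).shape)  # (128, 49, 96)
```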
Alternatives and similar repositories for FlashWindowAttention
Users interested in FlashWindowAttention are comparing it to the libraries listed below.
- A library for calculating the FLOPs in the forward() process based on torch.fx (a toy FLOP-counting sketch follows this list) ☆116 · Updated 2 months ago
- Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models ☆317 · Updated 4 months ago
- ☆286 · Updated 2 months ago
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores ☆319 · Updated 6 months ago
- 1.5–3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundatio… ☆221 · Updated 10 months ago
- [CVPR'23] SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer ☆72 · Updated last year
- [ECCV 2024] Official PyTorch implementation of RoPE-ViT "Rotary Position Embedding for Vision Transformer" ☆338 · Updated 6 months ago
- [ICLR 2023] "More ConvNets in the 2020s: Scaling up Kernels Beyond 51x51 using Sparsity"; [ICML 2023] "Are Large Kernels Better Teachers… ☆274 · Updated last year
- Implementation of fused cosine similarity attention in the same style as Flash Attention ☆214 · Updated 2 years ago
- A simple minimal implementation of Reversible Vision Transformers ☆125 · Updated last year
- Batch computation of the linear assignment problem on GPU. ☆83 · Updated 5 months ago
- This repository contains the experimental PyTorch native float8 training UX ☆224 · Updated 10 months ago
- Neighborhood Attention Extension. Bringing attention to a neighborhood near you! ☆528 · Updated this week
- When it comes to optimizers, it's always better to be safe than sorry ☆244 · Updated 2 months ago
- Fast Hadamard transform in CUDA, with a PyTorch interface (a reference Walsh–Hadamard sketch follows this list) ☆201 · Updated last year
- 🔥 A minimal training framework for scaling FLA models ☆178 · Updated 2 weeks ago
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance. ☆167 · Updated this week
- Official code of the papers "Reversible Column Networks" and "RevColv2" ☆263 · Updated last year
- [ICLR 2025] Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule ☆174 · Updated 3 months ago
- [CVPR 2023 Highlight] This is the official implementation of "Stitchable Neural Networks". ☆247 · Updated 2 years ago
- Transformers w/o Attention, based fully on MLPs ☆93 · Updated last year
- Official code for ICCV 2023 paper "Convolutional Networks with Oriented 1D Kernels" ☆46 · Updated last year
- Implementation of a memory-efficient multi-head attention as proposed in the paper "Self-attention Does Not Need O(n²) Memory" (a chunked-attention sketch follows this list) ☆379 · Updated last year
- ☆180 · Updated 9 months ago
- Flash-Muon: An Efficient Implementation of Muon Optimizer ☆133 · Updated 2 weeks ago
- [ICLR 2025 Spotlight] Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures ☆476 · Updated 4 months ago
- ViT inference in Triton, because why not? ☆29 · Updated last year
- [ECCV 2024] Isomorphic Pruning for Vision Models ☆69 · Updated 11 months ago
- Efficient Triton implementation of Native Sparse Attention. ☆168 · Updated last month
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning". ☆122 · Updated last year
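A few of the entries above are compact enough to sketch. For the torch.fx FLOP counter, the core idea is to symbolically trace the model, propagate tensor shapes through the graph, and sum multiply-accumulates per traced module. A toy illustration of that approach, not the linked library's API (the helper name count_linear_flops is made up here):

```python
import torch
import torch.fx
from torch.fx.passes.shape_prop import ShapeProp

def count_linear_flops(model, example_input):
    """Toy FLOP counter: trace the model with torch.fx, propagate tensor
    shapes, and count 2 * in * out multiply-accumulates per nn.Linear."""
    gm = torch.fx.symbolic_trace(model)
    ShapeProp(gm).propagate(example_input)
    flops = 0
    for node in gm.graph.nodes:
        if node.op == "call_module":
            mod = gm.get_submodule(node.target)
            if isinstance(mod, torch.nn.Linear):
                # Product of the leading (batch) dims of this node's output.
                batch = node.meta["tensor_meta"].shape[:-1].numel()
                flops += 2 * batch * mod.in_features * mod.out_features
    return flops

net = torch.nn.Sequential(torch.nn.Linear(128, 256), torch.nn.ReLU(),
                          torch.nn.Linear(256, 10))
print(count_linear_flops(net, torch.randn(32, 128)))  # 32 * 2 * (128*256 + 256*10)
```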
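The fast Hadamard transform entry wraps a CUDA kernel; as a plain-PyTorch reference for what it computes, here is the O(n log n) recursive Walsh–Hadamard transform (unnormalized, Sylvester ordering). This is a sketch of the transform itself, not that package's interface:

```python
import torch

def hadamard(x):
    """Unnormalized fast Walsh-Hadamard transform along the last dimension,
    via the recursion H_{2n} x = [H_n(a + b); H_n(a - b)] where a, b are the
    two halves of x. The length must be a power of two."""
    n = x.shape[-1]
    if n == 1:
        return x
    a, b = x[..., : n // 2], x[..., n // 2 :]
    return torch.cat([hadamard(a + b), hadamard(a - b)], dim=-1)

x = torch.randn(8, 1024)
y = hadamard(x)
# Applying the transform twice recovers n * x, since H_n @ H_n = n * I.
print(torch.allclose(hadamard(y), 1024 * x, rtol=1e-4, atol=1e-2))  # True
```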
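Finally, the memory-efficient attention entry implements the trick from "Self-attention Does Not Need O(n²) Memory": stream over key/value blocks with a running log-sum-exp so the full n × n attention matrix is never materialized. A minimal single-pass sketch of that accumulation, not the linked code:

```python
import torch

def chunked_attention(q, k, v, chunk=1024):
    """Attention with O(n * chunk) peak memory instead of O(n^2): visit
    key/value blocks sequentially, keeping a numerically stable running
    softmax via a per-query log-sum-exp accumulator."""
    scale = q.shape[-1] ** -0.5
    out = torch.zeros_like(q)
    lse = torch.full(q.shape[:-1] + (1,), float("-inf"),
                     dtype=q.dtype, device=q.device)  # running logsumexp
    for i in range(0, k.shape[-2], chunk):
        kb, vb = k[..., i:i + chunk, :], v[..., i:i + chunk, :]
        s = (q @ kb.transpose(-2, -1)) * scale          # (..., n_q, chunk)
        new_lse = torch.logaddexp(lse, torch.logsumexp(s, -1, keepdim=True))
        # Rescale what we have so far, then add this block's contribution.
        out = out * (lse - new_lse).exp() + (s - new_lse).exp() @ vb
        lse = new_lse
    return out

q = k = v = torch.randn(2, 4, 4096, 64)  # (batch, heads, seq_len, head_dim)
ref = torch.softmax((q @ k.transpose(-2, -1)) * 64 ** -0.5, -1) @ v
print(torch.allclose(chunked_attention(q, k, v), ref, atol=1e-4))  # True
```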