qwen-nsa
★87 · Oct 14, 2025 · Updated 4 months ago
Alternatives and similar repositories for Qwen-Native-Sparse-Attention
Users interested in Qwen-Native-Sparse-Attention also compare it to the repositories listed below.
- ★48 · Dec 13, 2025 · Updated 2 months ago
- Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention" · ★974 · Feb 5, 2026 · Updated last month
- ★13 · Jan 7, 2025 · Updated last year
- Efficient Triton implementation of Native Sparse Attention · ★269 · May 23, 2025 · Updated 9 months ago
- [ACL 2025] Squeezed Attention: Accelerating Long Prompt LLM Inference · ★57 · Nov 20, 2024 · Updated last year
- High-performance RMSNorm implementation using SM core storage (registers and shared memory) · ★30 · Jan 22, 2026 · Updated last month
- A Survey of Efficient Attention Methods: Hardware-Efficient, Sparse, Compact, and Linear Attention · ★284 · Dec 1, 2025 · Updated 3 months ago
- DeepSeek Native Sparse Attention PyTorch implementation · ★114 · Dec 17, 2025 · Updated 2 months ago
- Xmixers: a collection of SOTA efficient token/channel mixers · ★28 · Sep 4, 2025 · Updated 6 months ago
- Flash-Muon: an efficient implementation of the Muon optimizer · ★239 · Jun 15, 2025 · Updated 8 months ago
- DeeperGEMM: a heavily optimized DeepGEMM variant · ★74 · May 5, 2025 · Updated 10 months ago
- A sparse attention kernel supporting mixed sparse patterns · ★472 · Jan 18, 2026 · Updated last month
- ★227 · Nov 19, 2025 · Updated 3 months ago
- Automated bottleneck detection and solution orchestration · ★19 · Feb 24, 2026 · Updated last week
- Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark" · ★30 · Jun 30, 2025 · Updated 8 months ago
- SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs · ★194 · Sep 23, 2025 · Updated 5 months ago
- Expert-specialization MoE solution based on CUTLASS · ★27 · Jan 19, 2026 · Updated last month
- Website for CSE 234, Winter 2025 · ★13 · Mar 24, 2025 · Updated 11 months ago
- ★52 · May 19, 2025 · Updated 9 months ago
- An efficient implementation of the NSA (Native Sparse Attention) kernel · ★129 · Jun 24, 2025 · Updated 8 months ago
- [ICML 2025] XAttention: Block Sparse Attention with Antidiagonal Scoring · ★269 · Jul 6, 2025 · Updated 8 months ago
- [ICLR 2025, IEEE TPAMI 2026] Mixture Compressor & MC# · ★68 · Feb 12, 2025 · Updated last year
- This repository provides an improved LlamaGen model, fine-tuned on 500,000 high-quality images, each accompanied by over 300-token prompt… · ★30 · Oct 21, 2024 · Updated last year
- SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs · ★60 · Mar 25, 2025 · Updated 11 months ago
- LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation · ★33 · Feb 26, 2026 · Updated last week
- Implement Flash Attention using CuTe · ★102 · Dec 17, 2024 · Updated last year
- ★107 · Feb 25, 2025 · Updated last year
- Finetuning and inference tools for the CogView4 and CogVideoX model series · ★118 · May 14, 2025 · Updated 9 months ago
- A collection of hand-written LLM code · ★20 · Mar 25, 2025 · Updated 11 months ago
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling · ★21 · Feb 9, 2026 · Updated last month
- Benchmark tests supporting the TiledCUDA library · ★18 · Nov 19, 2024 · Updated last year
- Code for the paper [ICLR 2025 Oral] "FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference" · ★161 · Oct 13, 2025 · Updated 4 months ago
- ★118 · May 19, 2025 · Updated 9 months ago
- [ICML 2025 Spotlight] ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference · ★283 · May 1, 2025 · Updated 10 months ago
- PyTorch implementation of the Flash Spectral Transform Unit · ★22 · Sep 19, 2024 · Updated last year
- A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithm… · ★102 · Aug 25, 2025 · Updated 6 months ago
- Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serv… · ★275 · Updated this week
- Wave: Python Domain-Specific Language for High Performance Machine Learning · ★45 · Updated this week
- Vortex: A Flexible and Efficient Sparse Attention Framework · ★48 · Jan 21, 2026 · Updated last month
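Many of the entries above (the NSA Triton kernels, SeerAttention, XAttention, FlexPrefill, Vortex) share one core idea: cheaply score coarse blocks of the key sequence, then run full attention only inside the top-scoring blocks. A minimal NumPy sketch of that selection pattern, not taken from any of the listed repos (the function names and the mean-pooled block scoring are illustrative assumptions; real kernels fuse this into Triton/CUDA):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def block_sparse_attention(q, k, v, block_size=4, top_k=2):
    """Sketch of NSA-style selection: score mean-pooled key blocks per query,
    keep the top_k highest-scoring blocks, and attend only inside them."""
    T, d = k.shape
    n_blocks = T // block_size
    # Compressed keys: one mean-pooled vector per contiguous key block.
    k_blocks = k[: n_blocks * block_size].reshape(n_blocks, block_size, d).mean(axis=1)
    out = np.zeros((q.shape[0], d))
    for i, qi in enumerate(q):
        # Coarse importance score per block, then keep the top_k blocks.
        block_scores = k_blocks @ qi
        keep = np.argsort(block_scores)[-top_k:]
        idx = np.concatenate(
            [np.arange(b * block_size, (b + 1) * block_size) for b in keep]
        )
        # Exact softmax attention restricted to the selected positions.
        attn = softmax((k[idx] @ qi) / np.sqrt(d))
        out[i] = attn @ v[idx]
    return out
```

When `top_k` covers every block this reduces exactly to dense attention; the savings come from keeping `top_k * block_size` much smaller than the sequence length.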
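The RMSNorm entry above targets the normalization layer used throughout Qwen/LLaMA-style models; the underlying math is small enough to state directly. A plain NumPy reference sketch (the listed repo implements this on-GPU in registers and shared memory, which this does not attempt):

```python
import numpy as np

def rmsnorm(x, weight, eps=1e-6):
    """RMSNorm: rescale by the reciprocal root-mean-square of the last axis.
    Unlike LayerNorm, there is no mean subtraction and no bias term."""
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return (x / rms) * weight
```

With `weight` set to ones, the output rows have root-mean-square approximately 1 regardless of the input scale.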