MayDomine/Burst-Attention

MayDomine / Burst-Attention

Distributed IO-aware Attention algorithm

☆24

Alternatives and similar repositories for Burst-Attention

Users that are interested in Burst-Attention are comparing it to the libraries listed below.

MayDomine / Seq1F1B
Sequence-level 1F1B schedule for LLMs.
☆19 · Jun 4, 2024 · Updated 2 years ago
thunlp / Seq1F1B
Sequence-level 1F1B schedule for LLMs.
☆37 · Aug 26, 2025 · Updated 11 months ago
leimao / Nsight-Compute-Docker-Image
Nsight Compute In Docker
☆13 · Dec 21, 2023 · Updated 2 years ago
kwai / Megatron-Kwai
LLM training technologies developed by kwai
☆71 · Jun 30, 2026 · Updated 3 weeks ago
RulinShao / LightSeq
Official repository for DistFlashAttn: Distributed Memory-efficient Attention for Long-context LLMs Training
☆223 · Aug 19, 2024 · Updated last year
tlc-pack / libflash_attn
Standalone Flash Attention v2 kernel without libtorch dependency
☆113 · Sep 10, 2024 · Updated last year
heheda12345 / MagPy
☆41 · Jun 5, 2024 · Updated 2 years ago
xlite-dev / ffpa-attn
🤖FFPA: Extends FA-2/3 via Split-D for large headdims, 1.5x~6×↑🎉 vs SDPA, up to 513~535 TFLOPS🎉 on NVIDIA H200.
☆318 · Updated this week
sail-sg / zero-bubble-pipeline-parallelism
Zero Bubble Pipeline Parallelism
☆464 · May 7, 2025 · Updated last year
feifeibear / long-context-attention
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
☆683 · May 21, 2026 · Updated 2 months ago
zhuzilin / ring-flash-attention
Ring attention implementation with flash attention
☆1,038 · Sep 10, 2025 · Updated 10 months ago
crispyberry / MLIR-Pass-Tour
☆11 · Feb 28, 2023 · Updated 3 years ago
Karbo123 / pytorch_grouped_gemm
High Performance Grouped GEMM in PyTorch
☆30 · May 10, 2022 · Updated 4 years ago
ademeure / cuda-side-boost
☆60 · Feb 24, 2026 · Updated 5 months ago
cchan / tccl
extensible collectives library in triton
☆97 · Mar 31, 2025 · Updated last year
mandliya / PMPP_notes
Notes and code for Programming Massively Parallel Processors
☆13 · Mar 29, 2025 · Updated last year
zms1999 / SmartMoE
A MoE impl for PyTorch, [ATC'23] SmartMoE
☆73 · Jul 11, 2023 · Updated 3 years ago
tsinghua-ideal / Canvas
Canvas: End-to-End Kernel Architecture Search in Neural Networks
☆27 · Nov 18, 2024 · Updated last year
NormXU / Consistent-DynamicNTKRoPE
An Experiment on Dynamic NTK Scaling RoPE
☆65 · Nov 26, 2023 · Updated 2 years ago
Youhe-Jiang / IJCAI2023-OptimalShardedDataParallel
[IJCAI2023] An automated parallel training system that combines the advantages from both data and model parallelism. If you have any inte…
☆52 · May 31, 2023 · Updated 3 years ago
3rdparty / stout-borrowed-ptr
C++ "borrowing" smart pointer.
☆10 · May 13, 2022 · Updated 4 years ago
DeepLink-org / DLOP-Bench
A benchmark suited especially for deep learning operators
☆42 · Feb 13, 2023 · Updated 3 years ago
zexuanqiu / CLongEval
CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models
☆49 · Mar 7, 2024 · Updated 2 years ago
October2001 / ProLong
[ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models
☆61 · Jul 23, 2024 · Updated 2 years ago
Arktische / HUST-Survival-Book
☆22 · May 4, 2022 · Updated 4 years ago
pkunlp-icler / MLS
Source code of our paper "Focus on the Target’s Vocabulary: Masked Label Smoothing for Machine Translation" @ ACL 2022
☆13 · Apr 13, 2022 · Updated 4 years ago
AmberLJC / Sci-Reasoning
Sci-Reasoning: A Dataset Decoding AI Innovation Patterns
☆19 · Jan 13, 2026 · Updated 6 months ago
syncdoth / Chain-of-Hindsight-PyTorch
Unofficial implementation of Chain of Hindsight (https://arxiv.org/abs/2302.02676) using pytorch and huggingface Trainers.
☆11 · Apr 5, 2023 · Updated 3 years ago
LeviViana / torchessian
Full loss Hessian spectrum approximation tool.
☆13 · Oct 9, 2019 · Updated 6 years ago
M3-IT / YING-VLM
Vision Large Language Models trained on M3IT instruction tuning dataset
☆17 · Aug 16, 2023 · Updated 2 years ago
ZenithalHourlyRate / naming
☆11 · Apr 29, 2024 · Updated 2 years ago
causalNLP / amr_llm
This repo explores how AMR can address tasks that are difficult for LLMs
☆13 · Jan 15, 2024 · Updated 2 years ago
hky1999 / Unishyper
A Rust-based Unikernel Enhancing Reliability and Efficiency of Embedded Systems.
☆12 · Jun 28, 2024 · Updated 2 years ago
EricLee8 / MPD_EMVI
Official implementation of our paper at ACL 2023: Pre-training Multi-party Dialogue Models with Latent Discourse Inference
☆10 · Jul 10, 2023 · Updated 3 years ago
chenllliang / ParetoMNMT
Source code for paper "On the Pareto Front of Multilingual Neural Machine Translation" @ NeurIPS 2023
☆17 · Sep 27, 2023 · Updated 2 years ago
lifan-yuan / FactMix
Code for COLING 2022 paper "FactMix: Using a Few Labeled In-domain Examples to Generalize to Cross-domain Named Entity Recognition"
☆15 · Jan 15, 2023 · Updated 3 years ago
StefanHeng / ProgGen
Code for paper "ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models"
☆17 · Mar 29, 2024 · Updated 2 years ago
EthanZhangYC / OD-cheap-convolution
PyTorch implementation for OD-cheap-convolution.
☆20 · Sep 29, 2019 · Updated 6 years ago
ThisIsHwang / EXIT
Official code and resources for the paper "EXIT: Context-Aware Extractive Compression for Enhancing Retrieval-Augmented Generation."
☆25 · Jul 15, 2026 · Updated 2 weeks ago