feifeibear / ChituAttention
Quantized Attention on GPU
☆44 Updated last year
Alternatives and similar repositories for ChituAttention
Users who are interested in ChituAttention are comparing it to the libraries listed below.
- ☆52 Updated 8 months ago
- ☆117 Updated 8 months ago
- ☆129 Updated 5 months ago
- ☆65 Updated 9 months ago
- Accelerate LLM preference tuning via prefix sharing with a single line of code ☆51 Updated 6 months ago
- An auxiliary project analyzing the characteristics of KV in DiT attention. ☆32 Updated last year
- Odysseus: Playground of LLM Sequence Parallelism ☆79 Updated last year
- A Suite for Parallel Inference of Diffusion Transformers (DiTs) on multi-GPU Clusters ☆55 Updated last year
- DeeperGEMM: crazy optimized version ☆73 Updated 8 months ago