hyperai / triton-cn
Triton Documentation in Simplified Chinese
☆60Updated 2 months ago
Alternatives and similar repositories for triton-cn:
Users who are interested in triton-cn are comparing it to the libraries listed below.
- ☆124Updated 2 weeks ago
- ☆98Updated this week
- A llama model inference framework implemented in CUDA C++☆48Updated 4 months ago
- Implement Flash Attention using Cute.☆71Updated 3 months ago
- ⚡️Write HGEMM from scratch using Tensor Cores with WMMA, MMA and CuTe API, Achieve Peak⚡️ Performance.☆59Updated 2 weeks ago
- Compare different hardware platforms via the Roofline Model for LLM inference tasks.☆93Updated last year
- ☆74Updated 3 months ago
- LLM theoretical performance analysis tools, with support for params, FLOPs, memory, and latency analysis.☆80Updated 2 months ago
- Decoding Attention is specially optimized for MHA, MQA, GQA and MLA using CUDA core for the decoding stage of LLM inference.☆35Updated last week
- ☆42Updated 2 months ago
- ☆78Updated last year
- ☆87Updated 6 months ago
- Examples of CUDA implementations by Cutlass CuTe☆145Updated last month
- ☆139Updated 10 months ago
- 📚FFPA(Split-D): Yet another Faster Flash Prefill Attention with O(1) GPU SRAM complexity for headdim > 256, ~2x↑🎉vs SDPA EA.☆147Updated this week
- Optimize softmax in triton in many cases☆20Updated 6 months ago
- A light llama-like llm inference framework based on the triton kernel.☆100Updated last week
- FP8 flash attention implemented on the Ada architecture using the cutlass repository☆60Updated 7 months ago
- ☆145Updated 2 months ago
- We invite you to visit and follow our new repository at https://github.com/microsoft/TileFusion. TiledCUDA is a highly efficient kernel …☆177Updated last month
- A minimalist and extensible PyTorch extension for implementing custom backend operators in PyTorch.☆33Updated 11 months ago
- ☆31Updated 7 months ago
- ☆113Updated last year
- ☆127Updated 2 months ago
- ☆45Updated this week
- [USENIX ATC '24] Accelerating the Training of Large Language Models using Efficient Activation Rematerialization and Optimal Hybrid Paral…☆51Updated 7 months ago