meta-pytorch / kraken
Triton-based Symmetric Memory operators and examples
☆48 Updated last week
Alternatives and similar repositories for kraken
Users who are interested in kraken are comparing it to the libraries listed below.
- extensible collectives library in triton ☆90 Updated 6 months ago
- ring-attention experiments ☆155 Updated last year
- PyTorch bindings for CUTLASS grouped GEMM. ☆125 Updated 5 months ago
- A bunch of kernels that might make stuff slower 😉 ☆63 Updated this week
- Boosting 4-bit inference kernels with 2:4 Sparsity ☆84 Updated last year
- How to ensure correctness and ship LLM generated kernels in PyTorch ☆107 Updated last week
- QuTLASS: CUTLASS-Powered Quantized BLAS for Deep Learning ☆120 Updated this week
- DeeperGEMM: crazy optimized version ☆72 Updated 5 months ago
- Framework to reduce autotune overhead to zero for well known deployments. ☆84 Updated last month
- ☆93 Updated 11 months ago
- ☆35 Updated this week
- Triton-based implementation of Sparse Mixture of Experts. ☆246 Updated 3 weeks ago
- ☆50 Updated 5 months ago
- Odysseus: Playground of LLM Sequence Parallelism ☆78 Updated last year
- Applied AI experiments and examples for PyTorch ☆301 Updated 2 months ago
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance. ☆264 Updated this week
- ☆112 Updated last year
- ☆130 Updated 5 months ago
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆215 Updated this week
- ☆242 Updated this week
- This repository contains the experimental PyTorch native float8 training UX ☆223 Updated last year
- Collection of kernels written in Triton language