spcl / sten
Sparsity support for PyTorch
☆37 · Updated 5 months ago
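The listing gives only sten's one-line description ("Sparsity support for PyTorch"), so the snippet below is a minimal sketch of the kind of sparsity workflow such a library targets. It uses only PyTorch's built-in `torch.sparse` support, not sten's own API, which is not documented in this listing.

```python
import torch

# A small dense weight matrix with most entries pruned to zero.
dense = torch.tensor([[0.0, 2.0, 0.0],
                      [0.0, 0.0, 3.0],
                      [4.0, 0.0, 0.0]])

# Convert to PyTorch's sparse COO layout: only the non-zero values
# and their indices are stored.
sparse = dense.to_sparse()
print(sparse)

# Sparse-dense matrix multiply through PyTorch's sparse backend.
x = torch.randn(3, 5)
y = torch.sparse.mm(sparse, x)
print(y.shape)  # torch.Size([3, 5])
```

Libraries like sten build on this idea by letting sparse tensor formats and sparsifying operators plug into existing PyTorch models; the exact interface differs from the built-in `torch.sparse` calls shown here.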
Alternatives and similar repositories for sten
Users interested in sten are comparing it to the libraries listed below.
- ☆41 · Updated last year
- Extensible collectives library in Triton ☆87 · Updated 5 months ago
- Distributed SDDMM kernel ☆11 · Updated 3 years ago
- ☆23 · Updated last month
- ☆28 · Updated 8 months ago
- A data-centric compiler for machine learning ☆84 · Updated last year
- ☆111 · Updated last year
- A library of GPU kernels for sparse matrix operations ☆272 · Updated 4 years ago
- Experiment of using Tangent to autodiff Triton ☆81 · Updated last year
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing ☆97 · Updated 2 months ago
- Training neural networks in TensorFlow 2.0 with 5x less memory ☆134 · Updated 3 years ago
- ☆88 · Updated 10 months ago
- ☆16 · Updated 11 months ago
- A bunch of kernels that might make stuff slower 😉 ☆59 · Updated this week
- ☆37 · Updated 2 weeks ago
- cuASR: CUDA Algebra for Semirings ☆39 · Updated 3 years ago
- CUDA templates for tile-sparse matrix multiplication based on CUTLASS ☆51 · Updated 7 years ago
- JaxPP is a library for JAX that enables flexible MPMD pipeline parallelism for large-scale LLM training ☆53 · Updated last month
- Collection of kernels written in the Triton language ☆154 · Updated 5 months ago
- ☆18 · Updated 5 years ago
- Memory Optimizations for Deep Learning (ICML 2023) ☆107 · Updated last year
- How to ensure correctness and ship LLM-generated kernels in PyTorch ☆58 · Updated this week
- SparseTIR: Sparse Tensor Compiler for Deep Learning ☆138 · Updated 2 years ago
- Personal solutions to the Triton Puzzles ☆20 · Updated last year
- Distributed K-FAC preconditioner for PyTorch ☆90 · Updated this week
- Triton-based Symmetric Memory operators and examples ☆28 · Updated this week
- (NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters ☆40 · Updated 2 years ago
- GitHub mirror of the triton-lang/triton repo ☆73 · Updated this week
- Research and development for optimizing transformers ☆130 · Updated 4 years ago
- A parallel framework for training deep neural networks ☆63 · Updated 6 months ago