HicrestLaboratory / SPARTA
SParse AcceleRation on Tensor Architecture
☆18 Updated 9 months ago
Alternatives and similar repositories for SPARTA
Users who are interested in SPARTA are comparing it to the libraries listed below.
- A GPU performance prediction toolkit for CUDA programs ☆18 Updated 6 years ago
- Sparsity support for PyTorch ☆38 Updated 10 months ago
- cuASR: CUDA Algebra for Semirings ☆44 Updated 3 years ago
- Julia ports of the Rodinia benchmark suite for heterogeneous computing infrastructures ☆55 Updated 2 years ago
- ☆28 Updated last year
- ☆16 Updated last year
- MagmaDNN: a simple deep learning framework in c++ ☆51 Updated 5 years ago
- A Data-Centric Compiler for Machine Learning ☆85 Updated last month
- Research and development for optimizing transformers ☆131 Updated 4 years ago
- ☆11 Updated 4 years ago
- Training neural networks in TensorFlow 2.0 with 5x less memory ☆137 Updated 3 years ago
- CUDA templates for tile-sparse matrix multiplication based on CUTLASS. ☆50 Updated 7 years ago
- COCCL: Compression and precision co-aware collective communication library ☆29 Updated 10 months ago
- ☆20 Updated 6 years ago
- Distributed Communication-Optimal LU-factorization Algorithm ☆12 Updated 4 years ago
- A library of GPU kernels for sparse matrix operations. ☆283 Updated 5 years ago
- ParaDnn: A systematic performance analysis methodology for deep learning. ☆40 Updated 5 years ago
- Unit Scaling demo and experimentation code ☆16 Updated last year
- Poplar libraries ☆122 Updated 2 years ago
- ☆55 Updated last year
- Test suite for probing the numerical behavior of NVIDIA tensor cores ☆41 Updated last year
- Mille Crepe Bench: layer-wise performance analysis for deep learning frameworks. ☆18 Updated 6 years ago
- This is the open source version of HPL-MXP. The code performance has been verified on Frontier ☆18 Updated 6 months ago
- Benchmarking OpenBLAS on the Apple M1 ☆18 Updated 5 years ago
- NPBench - A Benchmarking Suite for High-Performance NumPy ☆91 Updated last week
- extensible collectives library in triton ☆93 Updated 10 months ago
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing. ☆105 Updated 7 months ago
- Loop Nest - Linear algebra compiler and code generator. ☆21 Updated 3 years ago
- Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large … ☆65 Updated 3 years ago
- ☆15 Updated 2 months ago