graphcore-research / gfloat
Generic floating-point types in Python
☆16 Updated last month
Alternatives and similar repositories for gfloat
Users who are interested in gfloat are comparing it to the libraries listed below.
- Customized matrix multiplication kernels ☆57 Updated 3 years ago
- ☆55 Updated last year
- Butterfly matrix multiplication in PyTorch ☆178 Updated 2 years ago
- CUDA templates for tile-sparse matrix multiplication based on CUTLASS. ☆50 Updated 7 years ago
- ☆16 Updated last year
- Statistics on GPUs ☆33 Updated 4 months ago
- GEMM and Winograd based convolutions using CUTLASS ☆28 Updated 5 years ago
- Experiment of using Tangent to autodiff triton ☆82 Updated 2 years ago
- Worked example of the process from Python source to CUDA kernel execution with Numba ☆45 Updated last year
- Fast matrix multiplication for few-bit integer matrices on CPUs. ☆28 Updated 6 years ago
- The simplest but fast implementation of matrix multiplication in CUDA. ☆40 Updated last year
- ☆49 Updated last year
- Sparsity support for PyTorch ☆38 Updated 10 months ago
- CUDA-accelerated minimum spanning tree algorithm -- data parallel Boruvka's algorithm ☆21 Updated 9 years ago
- cuASR: CUDA Algebra for Semirings ☆44 Updated 3 years ago
- PyTorch interface for the IPU ☆181 Updated 2 years ago
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ☆328 Updated last week
- A lightweight, Pythonic, frontend for MLIR ☆81 Updated 2 years ago
- A Data-Centric Compiler for Machine Learning ☆85 Updated last month
- ☆345 Updated last week
- ☆82 Updated last year
- muSYCL, the SYCL musical! ☆13 Updated last year
- ☆40 Updated 2 years ago
- High-Performance FP32 GEMM on CUDA devices ☆117 Updated last year
- ☆21 Updated 3 years ago
- Tokamax: A GPU and TPU kernel library. ☆169 Updated last week
- FlexAttention w/ FlashAttention3 Support ☆27 Updated last year
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆46 Updated last year
- [ICASSP'22] Integer-only Zero-shot Quantization for Efficient Speech Recognition ☆34 Updated 4 years ago
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆49 Updated 5 months ago