Michalos88 / Randomized_SVD_in_CUDA
FAST Randomized SVD on a GPU with CUDA 🏎️
☆11 Updated 5 years ago
Alternatives and similar repositories for Randomized_SVD_in_CUDA:
Users who are interested in Randomized_SVD_in_CUDA are comparing it to the libraries listed below:
- Benchmarking OpenBLAS on the Apple M1 ☆18 Updated 4 years ago
- FFTX Project ☆23 Updated 3 months ago
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆36 Updated this week
- Reference implementation of the draft C++ GraphBLAS specification. ☆30 Updated 3 weeks ago
- The Combinatorial BLAS (CombBLAS) is an extensible distributed-memory parallel graph library offering a small but powerful set of linear … ☆71 Updated 3 weeks ago
- C++ Header-Only Library for High-Performance Tensor-Vector Multiplication ☆21 Updated 2 months ago
- CUDA Templates for Linear Algebra Subroutines ☆14 Updated this week
- Round matrix elements to lower precision in MATLAB ☆36 Updated 2 years ago
- Asynchronous I/O for HDF5 ☆21 Updated last week
- ☆29 Updated 5 years ago
- MATLAB Code for Parameters of Floating-Point Arithmetics ☆8 Updated 2 years ago
- ☆15 Updated 5 months ago
- No-GIL Python environment featuring NVIDIA Deep Learning libraries. ☆43 Updated last week
- Experimental plugin for scikit-learn to be able to run (some estimators) on Intel GPUs via numba-dpex. ☆15 Updated last year
- HIP Python Low-level Bindings ☆19 Updated last week
- associative floating point addition ☆17 Updated 10 months ago
- MLIR tools and dialect for GraphBLAS ☆18 Updated 2 years ago
- HiCMA: Hierarchical Computations on Manycore Architectures ☆30 Updated last year
- NPBench - A Benchmarking Suite for High-Performance NumPy ☆78 Updated this week
- MGARD: MultiGrid Adaptive Reduction of Data ☆39 Updated 2 months ago
- SYCL Reference Manual ☆27 Updated 10 months ago
- Graph-indexed Pandas DataFrames for analyzing hierarchical performance data ☆31 Updated 4 months ago
- The CUDA target for Numba ☆70 Updated this week
- Linnea is an experimental tool for the automatic generation of optimized code for linear algebra problems. ☆68 Updated 3 years ago
- Data and reproducibility scripts for the UoB-HPC Performance Portability studies ☆15 Updated 9 months ago
- Distributed-memory, arbitrary-precision, dense and sparse-direct linear algebra, conic optimization, and lattice reduction ☆66 Updated 5 months ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆30 Updated 3 months ago
- Python CFFI Binding around SuiteSparse:GraphBLAS ☆21 Updated last week
- BLAS++ is a C++ wrapper around CPU and GPU BLAS (basic linear algebra subroutines), developed as part of the SLATE project. ☆76 Updated last week