roguh / cuda-fft
Yet another FFT implementation in CUDA. Includes benchmarks using simple data for comparing different implementations.
☆11Updated 4 years ago
Alternatives and similar repositories for cuda-fft
Users who are interested in cuda-fft are comparing it to the libraries listed below.
- Case studies constitute a modern interdisciplinary and valuable teaching practice which plays a critical and fundamental role in the deve…☆13Updated 7 years ago
- Examples from Programming in Parallel with CUDA☆161Updated 2 years ago
- ☆68Updated 11 years ago
- Main Book repository for the Parallel and High Performance Computing book, Manning Publications☆213Updated 3 years ago
- CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. …☆439Updated 2 years ago
- ☆461Updated 10 years ago
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial☆298Updated 2 weeks ago
- ☆282Updated 4 years ago
- EDSL for PDE solver composing☆77Updated 3 months ago
- Fast Fourier Transform implementation, computable on CUDA platform. Seminar project for MI-PRC course at FIT CTU.☆38Updated 2 years ago
- BLISlab: A Sandbox for Optimizing GEMM☆536Updated 4 years ago
- Sample code from the book "Professional CUDA C Programming"☆39Updated 2 years ago
- Source code that accompanies The CUDA Handbook.☆539Updated 7 months ago
- Online CUDA Occupancy Calculator☆80Updated 3 years ago
- Chinese translation, slides, and labs for Prof. Eijhout's Introduction to HPC.☆323Updated 3 years ago
- Xiao's CUDA Optimization Guide [NO LONGER ADDING NEW CONTENT]☆313Updated 2 years ago
- This is an implementation of sgemm_kernel on L1d cache.☆229Updated last year
- Introduction to Intel AVX-512☆51Updated last year
- Repo for the 2018 parallel computing course☆32Updated 4 years ago
- Example code for Intel AVX / AVX2 intrinsics.☆140Updated last year
- supplementary material/programming exercises☆73Updated 3 years ago
- Numerical linear algebra software package☆505Updated this week
- Google Colab Notebooks for Udacity CS344 - Intro to Parallel Programming☆134Updated 4 years ago
- a c++/cuda template library for tensor lazy evaluation☆163Updated 2 years ago
- Step-by-step optimization of CUDA SGEMM☆375Updated 3 years ago
- A simple high performance CUDA GEMM implementation.☆406Updated last year
- Examples demonstrating available options to program multiple GPUs in a single node or a cluster☆796Updated 6 months ago
- This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several…☆1,148Updated 2 years ago
- IMPACT GPU Algorithms Teaching Labs☆58Updated 2 years ago
- Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)☆856Updated last year