komrad36 / CUDALERP
Fast CUDA (GPU) Bilinear and Nearest-Neighbor Interpolation at high accuracy - uint8_t data
☆13Updated 4 years ago
Alternatives and similar repositories for CUDALERP
Users who are interested in CUDALERP are comparing it to the libraries listed below.
- Random number generator for large applications using vector instructions☆17Updated 9 years ago
- C++ convenience classes to be used with CUDA code, for both the host and the kernel parts.☆55Updated 6 years ago
- Code samples from the presentation "What do you mean by 'Cache Friendly'?"☆24Updated 4 years ago
- "Hardware, Software, and Compilers! Oh My!" tutorial files☆16Updated 5 years ago
- C++ Coding Conventions and Guidelines☆46Updated 6 years ago
- Portable 128-bit SIMD intrinsics☆58Updated 2 years ago
- status-value - A class for status and optional value for C++11 and later, C++98 variant provided in a single-file header-only library☆16Updated last year
- A machine vision library written in SYCL and C++ that shows performance-portable implementation of graph algorithms☆162Updated last year
- Experimental ranges for CUDA☆24Updated 6 years ago
- Fastest CPU (AVX2) Bilinear and Nearest-Neighbor Interpolation: 25-100% faster than OpenCV. For computer vision / image processing.☆21Updated 4 years ago
- ☆68Updated 2 years ago
- A compile-time header-only C++17 library for dataflow programming.☆28Updated last year
- String to Float Benchmark☆19Updated 6 years ago
- ASM methods to test small loop performance on x86☆13Updated 6 years ago
- Header file to translate SSE instructions to ARM NEON instructions☆48Updated 11 years ago
- This repository contains components that will support percolation via OpenCL and CUDA☆32Updated 3 years ago
- Mirror of Agner Fog's C++ vector class library☆30Updated 5 years ago
- Benchmark supporting baseless libel against clang-format☆11Updated 5 years ago
- A sort wrapper enabling both use of random-access sorting on non-random access containers, and increased performance for the sorting of l…☆20Updated last week
- An Array Mapped Tree implementation☆10Updated 5 years ago
- Fast integer division with divisor not known at compile time. To be used primarily in CUDA kernels.☆71Updated 9 years ago
- A universal thread-safe memory pool.☆26Updated 7 years ago
- Parallel Tasking Library (PTL) - Lightweight C++11 multithreading tasking system featuring thread-pool, task-groups, and lock-free task q…☆47Updated 8 months ago
- Fast multi-threaded line counter in Modern C++ (2-10x faster than `wc -l` for large files)☆18Updated 4 years ago
- devector and batch_deque containers for C++. See more at: http://erenon.hu/double_ended☆15Updated 7 years ago
- STL-like containers (array, vector, matrix, cube) usable in device code.☆31Updated last year
- A demonstration of tracing dynamic library loading and unloading on Linux.☆17Updated 8 years ago
- Benchmarking reading and parsing integers from a file in C++.☆9Updated 5 years ago
- A C/C++ task-based programming model for shared memory and distributed parallel computing.☆71Updated 4 years ago
- Highly composable C++17 template meta programming library☆39Updated 6 years ago