torstem / demo-cuda-pybind11
How to use CUDA with Python numpy
☆38 Updated 7 years ago
Alternatives and similar repositories for demo-cuda-pybind11:
Users who are interested in demo-cuda-pybind11 are comparing it to the repositories listed below.
- Template for GPU-accelerated Python libraries ☆48 Updated last year
- Template for starting a CUDA/C++ project using CMake, with GitHub Actions for CI ☆29 Updated 2 years ago
- Some CUDA design patterns and a bit of template magic for CUDA ☆151 Updated last year
- ☆22 Updated 11 months ago
- Fast and full-featured Matrix Market I/O library for C++, Python, and R ☆78 Updated 9 months ago
- ☆44 Updated 7 years ago
- Example of wrapping CGAL Delaunay triangulations and mesh refinement using pybind11 ☆43 Updated 5 years ago
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆44 Updated this week
- Exploring using stdpar and Cython ☆33 Updated 4 years ago
- Code samples for the CUDA tutorial "CUDA and Applications to Task-based Programming" ☆88 Updated last year
- Example of using pybind11 with numpy and publishing to PyPI and conda-forge ☆25 Updated this week
- ☆11 Updated 5 years ago
- MATLAB Code for Parameters of Floating-Point Arithmetics ☆8 Updated 3 years ago
- ☆58 Updated 8 months ago
- Conjugate Gradient for Least Squares in CUDA ☆52 Updated 9 years ago
- GPU-Accelerated multigrid solver for Poisson's equation in 2D ☆22 Updated 4 years ago
- CUDA tool set for non-C++ languages that provides functionality similar to Thrust, with NVRTC at its core. ☆59 Updated 2 years ago
- Fast zero-overhead bindings between NumPy and Eigen ☆139 Updated last month
- Local and distributed octrees based on Morton codes with halo discovery and exchange with a 3D collision detection algorithm ☆42 Updated 3 months ago
- A nanobind example project ☆102 Updated last month
- THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE. ☆84 Updated last year
- Implementation of ConjugateGradients method using C and Nvidia CUDA ☆51 Updated 2 years ago
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated last year
- This is a C++ port, initially performed by Luis Ibanez, of the LSQR library of Chris Paige and Michael Saunders. The same methodology was a… ☆23 Updated 3 months ago
- This example builds on the parallel-forall repo separate compilation example by adding CMake to it. ☆17 Updated 7 years ago
- MWE for using the Eigen library in CUDA kernels ☆119 Updated 2 years ago
- GPU accelerated multigrid library for Python ☆59 Updated 7 months ago
- Skeletonide is a parallel implementation of Zhang-Suen morphological thinning algorithm written in Halide-lang. Use it for fast skeletoni… ☆13 Updated 4 years ago
- BGHT: High-performance static GPU hash tables. ☆63 Updated last month
- VolSiM, a CNN-based metric to compute the similarity of 3D data from numerical simulations ☆15 Updated last year