bryancatanzaro/inplace

CUDA and OpenMP implementations of C2R/R2C inplace transposition

☆49

Alternatives and similar repositories for inplace

Users that are interested in inplace are comparing it to the libraries listed below.

klho / PyMatrixID
Fast interpolative decompositions in Python
☆10 · Jan 4, 2021 · Updated 5 years ago
bryancatanzaro / trove
Full-speed Array of Structures access
☆177 · Apr 25, 2023 · Updated 3 years ago
NaoyukiIchimura / cuda_image_filtering_global
☆11 · Dec 5, 2018 · Updated 7 years ago
Samsung / veles.simd
Distributed machine learning platform
☆13 · Aug 20, 2015 · Updated 10 years ago
benoitsteiner / tensorflow-xsmm
Improved performance for TensorFlow on Intel hardware.
☆13 · Jun 25, 2018 · Updated 8 years ago
timholy / Cartesian.jl
Fast multidimensional algorithms
☆18 · Feb 8, 2020 · Updated 6 years ago
solomonik / CANDMC
Communication Avoiding Numerical Dense Matrix Computations
☆11 · Dec 20, 2020 · Updated 5 years ago
kurocha / concurrent
☆13 · May 6, 2023 · Updated 3 years ago
cudarrays / cudarrays
Multi-dimensional array programming framework for C++ and multi-GPU CUDA applications
☆28 · Nov 27, 2016 · Updated 9 years ago
HPAC / tccg
Tensor Contraction Code Generator
☆40 · Aug 14, 2017 · Updated 8 years ago
cslab-ntua / sparsex
The SparseX sparse kernel optimization library
☆43 · Jan 16, 2019 · Updated 7 years ago
Metadiff / gir
Graph Intermediate Representation (GIR) library for ML
☆24 · Mar 18, 2017 · Updated 9 years ago
p12tic / libbittwiddle
A collection of bit manipulation routines for C++
☆21 · Jul 24, 2013 · Updated 13 years ago
mkmik / metacontext
playground for creating new statement and constructs in python using import hooks
☆15 · Apr 1, 2024 · Updated 2 years ago
vincent-maillou / serinv
Selected Decomposition Routines
☆23 · Apr 20, 2026 · Updated 3 months ago
md2z34 / winograd_gpu
GPU implementation of Winograd convolution
☆10 · Oct 23, 2017 · Updated 8 years ago
sanshar / StackBlock
☆11 · Mar 13, 2021 · Updated 5 years ago
malbrain / rwlock
Phase Fair and Standard Reader Writer Locks
☆16 · Sep 16, 2019 · Updated 6 years ago
hfp / libxstream
Library and accelerator backend
☆15 · Updated this week
bryancatanzaro / kmeans
kmeans
☆54 · Jul 6, 2016 · Updated 10 years ago
flame / tblis-strassen
Strassen's Algorithm for Tensor Contraction
☆15 · Jul 7, 2017 · Updated 9 years ago
springer13 / hptt
High-Performance Tensor Transpose library
☆205 · May 13, 2023 · Updated 3 years ago
quettabit / convolution_kernel
Accelerating CNN's convolution operation on GPUs by using memory-efficient data access patterns.
☆14 · Dec 8, 2017 · Updated 8 years ago
ararslan / julia-rs
Call Julia from Rust
☆16 · Dec 8, 2016 · Updated 9 years ago
alpaka-group / mallocMC
mallocMC: Memory Allocator for Many Core Architectures
☆58 · Jul 14, 2026 · Updated 2 weeks ago
ColfaxResearch / FALCON
Library for fast image convolution in neural networks on Intel Architecture
☆30 · Jun 25, 2017 · Updated 9 years ago
lutnn / blink-mm
☆16 · Jul 24, 2023 · Updated 3 years ago
NVIDIA / mpi-acx
MPI accelerator-integrated communication extensions
☆39 · Apr 4, 2023 · Updated 3 years ago
nmayhall-vt / FermiCG
☆11 · Jun 11, 2026 · Updated last month
Warlocat / x2camf
SOC integrals generator with atomic mean field approximation
☆11 · Apr 26, 2026 · Updated 3 months ago
AnonymousYWL / MYLIB
☆18 · Apr 8, 2022 · Updated 4 years ago
ScreamingDev / git-time
Estimate time you took on an branch, path or whole history
☆16 · Jan 17, 2018 · Updated 8 years ago
ap-hynninen / cutt
CUDA Tensor Transpose (cuTT) library
☆55 · Aug 10, 2017 · Updated 8 years ago
ofuhrer / HPC4WC
High Performance Computing for Weather and Climate
☆46 · Jun 26, 2026 · Updated last month
mpip / pnfft
Parallel nonequispaced fast Fourier transforms
☆16 · Jun 4, 2018 · Updated 8 years ago
ROCm / AITemplate
AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (N…
☆12 · Jun 24, 2024 · Updated 2 years ago
CNugteren / CLTune
CLTune: An automatic OpenCL & CUDA kernel tuner
☆186 · Dec 12, 2022 · Updated 3 years ago
ladamalina / coursera-algo
Programming Questions (July 2013)
☆11 · Apr 4, 2015 · Updated 11 years ago
art4711 / random-double
An algorithm for generating random doubles.
☆13 · Feb 21, 2017 · Updated 9 years ago