NVlabs/cub

THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE.

☆87

Alternatives and similar repositories for cub

Users who are interested in cub are comparing it to the libraries listed below.

NVIDIA / cub
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
☆1,840 · Oct 9, 2023 · Updated 2 years ago
venovako / GPUJACHx
The Jacobi-type (hyperbolic) SVD for CUDA.
☆10 · Feb 16, 2026 · Updated 5 months ago
lightsighter / CudaDMA
Emulating DMA Engines on GPUs for Performance and Portability
☆43 · May 17, 2015 · Updated 11 years ago
ChandlerGuan / Transkimmer
Code for ACL2022 publication Transkimmer: Transformer Learns to Layer-wise Skim
☆22 · Aug 21, 2022 · Updated 3 years ago
adwaitjog / mafia
MAFIA: Multiple Application Framework for GPU architectures
☆28 · Jan 21, 2022 · Updated 4 years ago
sandialabs / spack-manager
A project and machine deployment model using Spack
☆31 · Jul 7, 2026 · Updated 2 weeks ago
hkust-adsl / gass
☆43 · Apr 3, 2022 · Updated 4 years ago
kwantam / fffft
fft impl for ff::Field
☆17 · May 9, 2024 · Updated 2 years ago
AnonymousYWL / MYLIB
☆18 · Apr 8, 2022 · Updated 4 years ago
HardwareIR / netlistDB
netlistDB - Intermediate format for digital hardware representation with graph database API
☆31 · Mar 17, 2021 · Updated 5 years ago
NVIDIA-developer-blog / code-samples
Source code examples from the Parallel Forall Blog
☆1,332 · Sep 23, 2025 · Updated 10 months ago
illuhad / syclinfo
List all available information about all SYCL devices and platforms
☆15 · Sep 14, 2020 · Updated 5 years ago
billmuch / matmul_perf_test
☆15 · Apr 15, 2022 · Updated 4 years ago
Emma926 / mcbench
Mille Crepe Bench: layer-wise performance analysis for deep learning frameworks.
☆18 · Oct 22, 2019 · Updated 6 years ago
vegaluisjose / vast
Verilog AST
☆21 · Dec 2, 2023 · Updated 2 years ago
fvalasiad / devector
C++11 Header-only continuous-storage Double ended vector implementation similar to STL's std::vector for efficient insertions/removals at…
☆16 · Dec 29, 2022 · Updated 3 years ago
kumasento / polymer
Bridging polyhedral analysis tools to the MLIR framework
☆119 · Sep 9, 2023 · Updated 2 years ago
milakov / int_fastdiv
Fast integer division with divisor not known at compile time. To be used primarily in CUDA kernels.
☆75 · Nov 4, 2015 · Updated 10 years ago
MessyPaste / CenterNet_for_2080ti
Changes to the CenterNet code to make it work on the NVIDIA RTX 2080 Ti with PyTorch 1.2.x
☆15 · Apr 30, 2020 · Updated 6 years ago
MostaphaG / Summer_project-df
Python GUI for differential forms
☆13 · Oct 14, 2023 · Updated 2 years ago
openrisc / or1k_marocchino
OpenRISC processor IP core based on Tomasulo algorithm
☆36 · Feb 18, 2022 · Updated 4 years ago
arodchen / MaxSim
A simulation platform for managed applications based on Maxine VM and ZSim
☆29 · Jun 19, 2017 · Updated 9 years ago
brevzin / void
Regular Void: The Library
☆10 · May 10, 2022 · Updated 4 years ago
wq2012 / CurriculumVitae
Curriculum Vitae of Quan Wang
☆15 · Dec 13, 2025 · Updated 7 months ago
llnl / FPChecker
A dynamic analysis tool to detect floating-point errors in HPC applications.
☆42 · Jul 17, 2026 · Updated last week
learncompiler / compiler-lectures
☆16 · Jul 21, 2020 · Updated 6 years ago
harvard-edge / Gables
☆15 · Apr 3, 2020 · Updated 6 years ago
uchuhimo / amanda
☆18 · Apr 21, 2024 · Updated 2 years ago
peiswang / BitSplit
BitSplit Post-training Quantization
☆49 · Dec 20, 2021 · Updated 4 years ago
b-flo / warp-transducer
A fast parallel implementation of RNN Transducer.
☆12 · Apr 8, 2025 · Updated last year
codyjrivera / tsm2x-imp
Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA
☆35 · Jul 28, 2020 · Updated 5 years ago
NervanaSystems / maxas
Assembler for NVIDIA Maxwell architecture
☆1,073 · Jan 3, 2023 · Updated 3 years ago
bstamour / units
Using C++ templates to track dimensional metadata
☆11 · Nov 20, 2020 · Updated 5 years ago
Xilinx / mlir-xten
☆17 · Updated this week
hanchenye / polyaie
An MLIR-based compiler from C/C++ to AMD-Xilinx Versal AIE
☆17 · Aug 5, 2022 · Updated 3 years ago
spcl / open-earth-compiler
development repository for the open earth compiler
☆82 · Feb 19, 2021 · Updated 5 years ago
Yu-Maryland / Differentiable_Scheduler_ICML24
Differentiable Combinatorial Scheduling at Scale (ICML'24). Mingju Liu, Yingjie Li, Jiaqi Yin, Zhiru Zhang, Cunxi Yu.
☆22 · Oct 31, 2024 · Updated last year
Sibylau / HLS_designs
Systolic array implementations for Cholesky, LU, and QR decomposition
☆50 · Nov 12, 2024 · Updated last year
pku-dasys / easymac
☆18 · Feb 3, 2022 · Updated 4 years ago