NVIDIA/NVTX

The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resources in your applications.

☆547

Alternatives and similar repositories for NVTX

Users that are interested in NVTX are comparing it to the libraries listed below.

NVIDIA / nvbench
CUDA Kernel Benchmarking Library
☆901 · Updated this week
NVIDIA / cccl
CUDA Core Compute Libraries
☆2,435 · Updated this week
NVIDIA / jitify
A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC).
☆573 · Sep 15, 2025 · Updated 10 months ago
NVIDIA / Fuser
A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
☆396 · May 31, 2026 · Updated last month
NVIDIA / cuCollections
☆655 · Updated this week
NVIDIA / gdrcopy
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
☆1,399 · Updated this week
NVIDIA / HMM_sample_code
CUDA 12.2 HMM demos
☆21 · Jul 26, 2024 · Updated last year
NVIDIA / nvbandwidth
A tool for bandwidth measurements on NVIDIA GPUs.
☆734 · Updated this week
NVIDIA / cub
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
☆1,840 · Oct 9, 2023 · Updated 2 years ago
NVIDIA / compute-sanitizer-samples
Samples demonstrating how to use the Compute Sanitizer Tools and Public API
☆99 · Nov 6, 2023 · Updated 2 years ago
NVIDIA / cutlass
CUDA Templates and Python DSLs for High-Performance Linear Algebra
☆10,104 · Updated this week
NVIDIA / cudnn-frontend
cuDNN Frontend is NVIDIA's modern, open-source entry point to the cuDNN library and a growing collection of high-performance open-source …
☆886 · Updated this week
rapidsai / rmm
RAPIDS Memory Manager
☆705 · Updated this week
microsoft / NPKit
NCCL Profiling Kit
☆155 · Jul 1, 2024 · Updated 2 years ago
NVIDIA / multi-gpu-programming-models
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
☆908 · Sep 26, 2025 · Updated 9 months ago
openucx / ucc
Unified Collective Communication Library
☆310 · Updated this week
NVIDIA / TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on H…
☆3,435 · Updated this week
NVIDIA / nccl
Optimized primitives for collective multi-GPU communication
☆4,893 · Updated this week
dmlc / dlpack
common in-memory tensor structure
☆1,232 · Jun 19, 2026 · Updated last month
pytorch / kineto
A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.
☆974 · Updated this week
NVIDIA / CUDALibrarySamples
CUDA Library Samples
☆2,463 · Updated this week
NVIDIA / nccl-tests
NCCL Tests
☆1,595 · Jul 9, 2026 · Updated last week
yester31 / Cutlass_EX
study of cutlass
☆22 · Nov 10, 2024 · Updated last year
NVIDIA / cuda-samples
Samples for CUDA Developers which demonstrates features in CUDA Toolkit
☆9,406 · May 27, 2026 · Updated last month
NVIDIA / libcudacxx
[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl
☆2,304 · Feb 7, 2024 · Updated 2 years ago
llvm / torch-mlir
The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.
☆1,868 · Updated this week
Mellanox / nccl-rdma-sharp-plugins
RDMA and SHARP plugins for nccl library
☆233 · Apr 3, 2026 · Updated 3 months ago
NVIDIA / DCGM
NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs
☆763 · Jul 6, 2026 · Updated 2 weeks ago
tensorflow / mlir-hlo
☆421 · Feb 24, 2026 · Updated 4 months ago
bryancatanzaro / trove
Full-speed Array of Structures access
☆177 · Apr 25, 2023 · Updated 3 years ago
rapidsai / rapids-cmake
☆47 · Updated this week
UoB-HPC / minifmm
☆11 · Aug 8, 2021 · Updated 4 years ago
NVIDIA / AMGX
Distributed multigrid linear solver library on GPU
☆680 · Jul 9, 2026 · Updated last week
Lin-Mao / DrGPUM
A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications.
☆36 · May 30, 2026 · Updated last month
NVIDIA / cuda-gdb
CUDA GDB
☆245 · Jun 1, 2026 · Updated last month
LeiWang1999 / tvm_gpu_gemm
play gemm with tvm
☆91 · Jul 22, 2023 · Updated 2 years ago
ColfaxResearch / cutlass-kernels
☆269 · Jul 11, 2024 · Updated 2 years ago
NVIDIA / PyProf
A GPU performance profiling tool for PyTorch models
☆510 · Jul 13, 2021 · Updated 5 years ago
NVIDIA / cuda-python
CUDA Python: Performance meets Productivity
☆3,320 · Updated this week