A dynamic binary instrumentation tool for tracing and analyzing CUDA kernel instructions.
☆48 · Apr 9, 2026 · Updated this week
Alternatives and similar repositories for CUTracer
Users who are interested in CUTracer are comparing it to the libraries listed below.
- Luthier, a GPU binary instrumentation tool for AMD GPUs ☆28 · Updated this week
- My attempt to improve the speed of the Newton-Schulz algorithm, starting from the Dion implementation. ☆35 · Dec 5, 2025 · Updated 4 months ago
- TritonParse: A Compiler Tracer, Visualizer, and Reproducer for Triton Kernels ☆198 · Updated this week
- A Top-Down Profiler for GPU Applications ☆22 · Feb 29, 2024 · Updated 2 years ago
- Source code for the CPU-Free model - a fully autonomous execution model for multi-GPU applications that completely excludes the involveme… ☆22 · Apr 25, 2024 · Updated last year
- [WACV 2026] SceneEdited: A City-Scale Benchmark for 3D HD Map Updating via Image-Guided Change Detection ☆16 · Updated this week
- MSLK (Meta Superintelligence Labs Kernels) is a collection of PyTorch GPU operator libraries that are designed and optimized for GenAI tr… ☆94 · Updated this week
- A simple calculation for LLM MFU. ☆76 · Sep 10, 2025 · Updated 7 months ago
- MOLA implementations of the virtual state estimation API for robots / vehicles ☆11 · Apr 7, 2026 · Updated last week
- [RA-L] SHeRLoc: Synchronized Heterogeneous Radar Place Recognition for Cross-Modal Localization ☆28 · Nov 24, 2025 · Updated 4 months ago
- NVIDIA SASS disassembler/inline patcher ☆64 · Updated this week
- Accelerating SDF gradient computation in NeuS-like multi-view reconstruction with directional finite difference (DFD) and patch-based sam… ☆34 · Mar 24, 2024 · Updated 2 years ago
- Isolated Kalman Filtering C++ library ☆18 · Dec 29, 2025 · Updated 3 months ago
- Script debugger for Grand Theft Auto V. ☆20 · Dec 20, 2025 · Updated 3 months ago
- A Maven dependency graph generator for Bazel ☆17 · Apr 4, 2026 · Updated last week
- Official Implementation of SEA: Sparse Linear Attention with Estimated Attention Mask (ICLR 2024) ☆12 · Jun 20, 2025 · Updated 9 months ago
- A set of useful algebraic preconditioners for iterative numerical linear-algebraic methods. ☆18 · Jul 23, 2022 · Updated 3 years ago
- [RA-L'24, IROS'24] Official PyTorch Implementation of "Uni-DVPS: Unified Model for Depth-Aware Video Panoptic Segmentation" ☆13 · Oct 11, 2024 · Updated last year
- ☆10 · May 12, 2022 · Updated 3 years ago
- Blindspots in LLMs I've noticed while AI coding. Sonnet family emphasis. ☆13 · Mar 20, 2025 · Updated last year
- Measures the conformance of a BPF runtime to the ISA. ☆37 · Updated this week
- [TMLR 2025] Unifi3D: A Study on 3D Representations for Generation and Reconstruction in a Common Framework ☆41 · Dec 17, 2025 · Updated 3 months ago
- diffusers with search engine ☆12 · Jan 13, 2026 · Updated 3 months ago
- ☆14 · Mar 8, 2025 · Updated last year
- Framework for Algorithmic Correctness Testing of Operators ☆17 · Mar 9, 2026 · Updated last month
- A WebAssembly eBPF runtime based on wasmtime in Rust ☆11 · Feb 20, 2023 · Updated 3 years ago
- A high-performance and tiny TVM graph executor library written in C which can compile to WebAssembly and use CUDA/WebGPU as the accelerat… ☆12 · Aug 3, 2023 · Updated 2 years ago
- Artifact for 'Register Optimizations for Stencils on GPUs' ☆10 · Sep 18, 2018 · Updated 7 years ago
- naïve blockchain in Rust ☆10 · Nov 13, 2020 · Updated 5 years ago
- ☆12 · Mar 7, 2024 · Updated 2 years ago
- Rohbau3D: A Shell Construction Site 3D Point Cloud Dataset ☆26 · Jan 29, 2026 · Updated 2 months ago
- A collection of specialized agent skills for AI infrastructure development, enabling Claude Code to write, optimize, and debug high-perfo… ☆106 · Feb 2, 2026 · Updated 2 months ago
- Userspace eBPF Runtime Benchmarking Test Suite and Results ☆16 · Updated this week
- Training and evaluation of a texture representation network. ☆11 · Apr 22, 2020 · Updated 5 years ago
- Notes on optimizing the Linux kernel function csum_partial ☆14 · Nov 28, 2021 · Updated 4 years ago
- Causal Analysis of Agent Behavior for AI Safety ☆20 · Jun 27, 2023 · Updated 2 years ago
- A scheduling framework for multitasking over diverse XPUs, including GPUs, NPUs, ASICs, and FPGAs ☆167 · Jan 13, 2026 · Updated 3 months ago
- Efficient Long-context Language Model Training by Core Attention Disaggregation ☆97 · Apr 7, 2026 · Updated last week
- PyTorch routines for (Ker)nel (Mac)hines ☆11 · Oct 10, 2025 · Updated 6 months ago