facebookresearch/CUTracer

A dynamic binary instrumentation tool for tracing and analyzing CUDA kernel instructions.

☆65

Alternatives and similar repositories for CUTracer

Users who are interested in CUTracer are comparing it to the repositories listed below.

matinraayai / Luthier
Luthier, a GPU binary instrumentation tool for AMD GPUs
☆28 · Updated this week
meta-pytorch / tritonparse
TritonParse: A Compiler Tracer, Visualizer, and Reproducer for Triton Kernels
☆203 · Updated this week
thib-s / flash-newton-schulz
My attempt to improve the speed of the Newton-Schulz algorithm, starting from the Dion implementation.
☆36 · Updated this week
FindHao / drgpu
A Top-Down Profiler for GPU Applications
☆22 · Feb 29, 2024 · Updated 2 years ago
ChadLin9596 / ScenePoint-ETK
[WACV 2026] SceneEdited: A City-Scale Benchmark for 3D HD Map Updating via Image-Guided Change Detection
☆16 · Apr 21, 2026 · Updated last week
libreliu / ShaderPerFormer
Code repository for the paper "ShaderPerFormer: Platform-independent Context-aware Shader Performance Predictor"
☆12 · May 16, 2024 · Updated last year
meta-pytorch / MSLK
MSLK (Meta Superintelligence Labs Kernels) is a collection of PyTorch GPU operator libraries that are designed and optimized for GenAI tr…
☆102 · Updated this week
CalvinXKY / mfu_calculation
A simple calculation for LLM MFU.
☆77 · Sep 10, 2025 · Updated 7 months ago
hanjun815 / SHeRLoc
[RA-L] SHeRLoc: Synchronized Heterogeneous Radar Place Recognition for Cross-Modal Localization
☆28 · Nov 24, 2025 · Updated 5 months ago
xucao-42 / Neuralangelo_DFD
Accelerating SDF gradient computation in NeuS-like multi-view reconstruction with directional finite difference (DFD) and patch-based sam…
☆34 · Mar 24, 2024 · Updated 2 years ago
KuangjuX / AttnLink
An experimental communicating attention kernel based on DeepEP.
☆34 · Jul 29, 2025 · Updated 9 months ago
phlippe / uvadlc_notebooks_benchmarking
Benchmark scripts for comparing tutorials in PyTorch and JAX
☆14 · Aug 25, 2022 · Updated 3 years ago
aau-cns / ikf_lib
Isolated Kalman Filtering C++ library
☆19 · Dec 29, 2025 · Updated 4 months ago
ShinyWasabi / scrDbg
Script debugger for Grand Theft Auto V.
☆23 · Apr 26, 2026 · Updated last week
gmlwns2000 / sea-attention
Official Implementation of SEA: Sparse Linear Attention with Estimated Attention Mask (ICLR 2024)
☆12 · Jun 20, 2025 · Updated 10 months ago
david-m-rosen / Preconditioners
A set of useful algebraic preconditioners for iterative numerical linear-algebraic methods.
☆18 · Jul 23, 2022 · Updated 3 years ago
kaist-ami / Uni-DVPS
[RA-L'24, IROS'24] Official PyTorch Implementation of "Uni-DVPS: Unified Model for Depth-Aware Video Panoptic Segmentation"
☆13 · Oct 11, 2024 · Updated last year
MoZeWei / moTuner
☆10 · May 12, 2022 · Updated 3 years ago
ezyang / ai-blindspots
Blindspots in LLMs I've noticed while AI coding. Sonnet family emphasis.
☆13 · Mar 20, 2025 · Updated last year
Alan-Jowett / bpf_conformance
Measures the conformance of a BPF runtime to the ISA.
☆37 · Apr 25, 2026 · Updated last week
naturalatlas / mapbox-gl-native-node
☆14 · Oct 6, 2020 · Updated 5 years ago
SJTU-IPADS / PhoenixOS
Fast OS-level support for GPU checkpoint and restore
☆282 · Sep 28, 2025 · Updated 7 months ago
Deep-Learning-Profiling-Tools / triton-samples
☆14 · Mar 8, 2025 · Updated last year
meta-pytorch / FACTO
Framework for Algorithmic Correctness Testing of Operators
☆17 · Mar 9, 2026 · Updated last month
eunomia-bpf / wasm-bpf-rs
A WebAssembly eBPF runtime based on wasmtime, written in Rust
☆11 · Feb 20, 2023 · Updated 3 years ago
amtp-protocol / agentry
Orchestration and memory for multi-agent systems
☆14 · Feb 6, 2026 · Updated 2 months ago
yanghaku / tvm-rt-wasm
A high-performance, tiny TVM graph executor library written in C that can compile to WebAssembly and use CUDA/WebGPU as the accelerat…
☆12 · Aug 3, 2023 · Updated 2 years ago
pssrawat / ppopp-artifact
Artifact for 'Register Optimizations for Stencils on GPUs'
☆10 · Sep 18, 2018 · Updated 7 years ago
ixxchan / nb
naïve blockchain in Rust
☆10 · Nov 13, 2020 · Updated 5 years ago
vdesai2014 / diffusion-policy-accelerated
☆12 · Mar 7, 2024 · Updated 2 years ago
futherus / mipt_lab
Physics laboratory assignments
☆10 · Oct 5, 2024 · Updated last year
rancavil / tornado-openshift-quickstart
Tornado Web Server git repository for OpenShift with Python 3.3
☆15 · Dec 13, 2015 · Updated 10 years ago
eunomia-bpf / bpf-benchmark
Userspace eBPF Runtime Benchmarking Test Suite and Results
☆16 · Updated this week
zqljintu / Gameinterface-Doudizhu
This is a game interface called the doudizhu by Qt, and I only imitated the interface simply. The object has the function of random license…
☆12 · Sep 6, 2018 · Updated 7 years ago
YD1RUH / LoRa_EmComm
An Implementation of LoRa for EmComm (Emergency Communication) or TacComm (Tactical Communication)
☆20 · Jul 23, 2025 · Updated 9 months ago
lisitsynSA / llvm_course
LLVM passes and IR generators code examples
☆15 · Feb 12, 2026 · Updated 2 months ago
fenrus75 / csum_partial
Notes on optimizing the Linux kernel function csum_partial
☆14 · Nov 28, 2021 · Updated 4 years ago
google-deepmind / agent_debugger
Causal Analysis of Agent Behavior for AI Safety
☆20 · Jun 27, 2023 · Updated 2 years ago
IaroslavElistratov / triton-autodiff
☆18 · Nov 11, 2025 · Updated 5 months ago