A dynamic binary instrumentation tool for tracing and analyzing CUDA kernel instructions.
☆67 · Jun 12, 2026 · Updated this week
Alternatives and similar repositories for CUTracer
Users that are interested in CUTracer are comparing it to the libraries listed below.
- TritonParse: A Compiler Tracer, Visualizer, and Reproducer for Triton Kernels ☆212 · Jun 10, 2026 · Updated last week
- My attempt to improve the speed of the Newton-Schulz algorithm, starting from the Dion implementation. ☆38 · Apr 30, 2026 · Updated last month
- Source code for the CPU-Free model - a fully autonomous execution model for multi-GPU applications that completely excludes the involveme… ☆21 · Apr 25, 2024 · Updated 2 years ago
- [WACV 2026] SceneEdited: A City-Scale Benchmark for 3D HD Map Updating via Image-Guided Change Detection ☆18 · Jun 7, 2026 · Updated last week
- This is the repository for codes in paper "ShaderPerFormer: Platform-independent Context-aware Shader Performance Predictor" ☆12 · May 16, 2024 · Updated 2 years ago
- MSLK (Meta Superintelligence Labs Kernels) is a collection of PyTorch GPU operator libraries that are designed and optimized for GenAI tr… ☆108 · Updated this week
- NVIDIA SASS disassembler/inline patcher ☆85 · Updated this week
- Accelerating SDF gradient computation in NeuS-like multi-view reconstruction with directional finite difference (DFD) and patch-based sam… ☆34 · Mar 24, 2024 · Updated 2 years ago
- An experimental communicating attention kernel based on DeepEP. ☆34 · Jul 29, 2025 · Updated 10 months ago
- A set of useful algebraic preconditioners for iterative numerical linear-algebraic methods. ☆18 · Jul 23, 2022 · Updated 3 years ago
- ☆10 · May 12, 2022 · Updated 4 years ago
- Blindspots in LLMs I've noticed while AI coding. Sonnet family emphasis. ☆13 · Mar 20, 2025 · Updated last year
- diffusers with search engine ☆12 · Jan 13, 2026 · Updated 5 months ago
- Fast OS-level support for GPU checkpoint and restore ☆283 · Sep 28, 2025 · Updated 8 months ago
- Framework for Algorithmic Correctness Testing of Operators ☆16 · Mar 9, 2026 · Updated 3 months ago
- Measures the conformance of a BPF runtime to the ISA. ☆38 · Jun 6, 2026 · Updated last week
- A WebAssembly eBPF runtime based on wasmtime in Rust ☆11 · Feb 20, 2023 · Updated 3 years ago
- Orchestration and memory for multi-agent systems ☆16 · Jun 8, 2026 · Updated last week
- Artifact for 'Register Optimizations for Stencils on GPUs' ☆10 · Sep 18, 2018 · Updated 7 years ago
- naïve blockchain in Rust ☆10 · Nov 13, 2020 · Updated 5 years ago
- ☆18 · Feb 16, 2024 · Updated 2 years ago
- ☆12 · Mar 7, 2024 · Updated 2 years ago
- Simulate and Render MuJoCo in the Browser with 3DGS. ☆49 · Apr 16, 2026 · Updated 2 months ago
- High-speed GEMV kernels, at most 2.7x speedup compared to PyTorch baseline. ☆129 · Jul 13, 2024 · Updated last year
- This is a game interface for Doudizhu built with Qt, and I only imitated the interface simply. The object has the function of random license… ☆12 · Sep 6, 2018 · Updated 7 years ago
- Userspace eBPF Runtime Benchmarking Test Suite and Results ☆17 · Jun 10, 2026 · Updated last week
- Training and evaluation of a texture representation network. ☆11 · Apr 22, 2020 · Updated 6 years ago
- LLVM passes and IR generators code examples ☆15 · Feb 12, 2026 · Updated 4 months ago
- Notes on optimizing the Linux kernel function csum_partial ☆14 · Nov 28, 2021 · Updated 4 years ago
- Causal Analysis of Agent Behavior for AI Safety ☆20 · Jun 27, 2023 · Updated 2 years ago
- ☆19 · Nov 11, 2025 · Updated 7 months ago
- Multi-Spectral Gaussian Splatting with Neural Color Representation ☆31 · May 6, 2026 · Updated last month
- LoRAFusion: Efficient LoRA Fine-Tuning for LLMs ☆28 · Apr 8, 2026 · Updated 2 months ago
- Keystroke dynamics refers to the automated method of identifying or confirming the identity of an individual based on the manner and the … ☆20 · Jan 28, 2020 · Updated 6 years ago
- Efficient Long-context Language Model Training by Core Attention Disaggregation ☆105 · Apr 7, 2026 · Updated 2 months ago
- ☆12 · Feb 24, 2023 · Updated 3 years ago
- ☆12 · Jan 19, 2020 · Updated 6 years ago
- ☆16 · Sep 26, 2024 · Updated last year
- Rohbau3D: A Shell Construction Site 3D Point Cloud Dataset ☆38 · May 18, 2026 · Updated last month