Xuhpclab / DrCCTProf

DrCCTProf is a fine-grained call path profiling framework for binaries running on ARM and X86 architectures.

☆122

Alternatives and similar repositories for DrCCTProf

Users who are interested in DrCCTProf are comparing it to the libraries listed below.

Xuhpclab / jxperf
☆11 Updated 3 years ago
ScalableMachinesResearch / JXPerf
Java inefficiency detection tool based on CPU performance monitoring counters and hardware debug registers. The tool detects dead writes, …
☆45 Updated 3 years ago
Ptolemy-DL / Ptolemy
☆96 Updated 4 years ago
CCTLib / cctlib
☆34 Updated 3 years ago
clevercool / TileSparsity
☆106 Updated 4 years ago
xiexi51 / MaxK-GNN
Official implementation of "MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training"
☆38 Updated last year
intel / memory-bandwidth-benchmarks
Provides a set of benchmarks that can be used to measure the memory bandwidth performance of CPUs
☆90 Updated last year
xiexi51 / ICCAD-Accel-GCN
Official Implementation of "Accel-GNN: High-Performance GPU Accelerator Design for Graph Neural Networks"
☆50 Updated 3 months ago
WUSTL-CSPL / Kairos-Userspace
☆17 Updated 11 months ago
sderek / CUDAAdvisor
CUDAAdvisor: a GPU profiling tool
☆49 Updated 6 years ago
GVProf / GVProf
GVProf: A Value Profiler for GPU-based Clusters
☆50 Updated last year
ithemal / bhive
☆38 Updated 3 years ago
csl-iisc / iGUARD-SOSP21
Race detector for NVIDIA GPUs, published in SOSP 2021.
☆18 Updated 4 months ago
Qcompiler / MixQ_Tensorrt_LLM
Mixed-precision inference with TensorRT-LLM
☆80 Updated 8 months ago
SabaJamilan / Profile-Guided-Software-Prefetching
☆23 Updated 2 years ago
SamAinsworth / reproduce-cgo2017-paper
Artifact Evaluation Reproduction for "Software Prefetching for Indirect Memory Accesses", CGO 2017, using CK.
☆38 Updated 3 years ago
Qcompiler / MIXQ
MIXQ: Taming Dynamic Outliers in Mixed-Precision Quantization by Online Prediction
☆90 Updated 8 months ago
jdmccalpin / periodic-performance-counters
A low-overhead tool to periodically collect system-wide hardware performance counters on Intel64 systems.
☆32 Updated 2 years ago
gongbell / CUDAsmith
A CUDA compiler fuzzer
☆25 Updated last year
upenn-acg / ocolos-public
Ocolos is the first online code layout optimization system for unmodified applications written in unmanaged languages.
☆52 Updated this week
srvm / cupti_profiler
CUPTI GPU Profiler
☆38 Updated 6 years ago
ARM-software / synchronization-benchmarks
Collection of synchronization micro-benchmarks and traces from infrastructure applications
☆44 Updated 2 weeks ago
intel / DML
Intel® Data Mover Library (Intel® DML)
☆95 Updated 3 months ago
DependableSystemsLab / LLFI
LLFI is an LLVM-based fault injection tool that injects faults into the LLVM IR of the application source code. The faults can be injec…
☆73 Updated 2 years ago
stevenpelley / atomic-memory-trace
PIN-tool to produce multi-threaded atomic memory traces
☆36 Updated 11 years ago
wcohen / libpfm4
This is a mirror of the official libpfm4 git repository, https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/ with some local branc…
☆65 Updated 8 months ago
Lin-Mao / DrGPUM
A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications.
☆25 Updated 8 months ago
thu-pacman / Spindle
☆32 Updated 2 years ago
google / multichase
☆132 Updated 3 weeks ago
CGCL-codes / YiTu
YiTu is an easy-to-use runtime to fully exploit the hybrid parallelism of different hardware (e.g., GPU) to efficiently support the exec…
☆258 Updated 2 months ago