UniHD-CEG / cuda-memtrace
LLVM Plugin to Instrument Global Memory Accesses in CUDA Kernels
☆10 Updated 4 years ago
Alternatives and similar repositories for cuda-memtrace:
Users who are interested in cuda-memtrace compare it to the repositories listed below
- ☆31 Updated last year
- HeteroSync is a benchmark suite for performing fine-grained synchronization on tightly coupled GPUs ☆28 Updated 6 months ago
- CUDAAdvisor: a GPU profiling tool ☆48 Updated 6 years ago
- An LLVM pass to profile dynamic LLVM IR instructions and runtime values ☆138 Updated 4 years ago
- ☆59 Updated 5 months ago
- Updated C version of the Test Suite for Vectorising Compilers ☆57 Updated last year
- ☆33 Updated 3 years ago
- ☆37 Updated 3 months ago
- Tools to track memory accesses in applications and visualize the patterns to reveal opportunities for optimization. ☆92 Updated 9 years ago
- A Pin tool for collecting microarchitecture-independent workload characteristics ☆60 Updated last year
- A Symbolic Emulator for Shuffle Synthesis on the NVIDIA PTX Code ☆14 Updated 2 years ago
- Interprocedural Basic Block Code Layout Optimization ☆18 Updated 6 years ago
- PIN-tool to produce multi-threaded atomic memory traces ☆36 Updated 11 years ago
- Repeated access to L2-containable loops to look for snoop filter conflicts on Intel Skylake Xeon processors. ☆29 Updated 6 years ago
- GPUReplay, ASPLOS 2022 ☆35 Updated 3 years ago
- Haystack is an analytical cache model that, given a program, computes the number of cache misses. ☆46 Updated 5 years ago
- CERE: Codelet Extractor and REplayer ☆40 Updated last year
- ☆34 Updated 3 years ago
- ☆51 Updated 5 years ago
- Polyhedral Extraction Tool (source repository: http://repo.or.cz/w/pet.git) ☆39 Updated 2 years ago
- Race detector for NVIDIA GPUs, published in SOSP 2021. ☆18 Updated last month
- A dynamic analysis tool to detect floating-point errors in HPC applications. ☆33 Updated last week
- ☆28 Updated 2 years ago
- ☆15 Updated 6 years ago
- Code released to accompany the ISCA paper: "T4: Compiling Sequential Code for Effective Speculative Parallelization in Hardware" ☆28 Updated 3 years ago
- Chunky Loop Analyzer: A Polyhedral Representation Extraction Tool for High Level Programs ☆24 Updated 2 years ago
- Clang-based translator for OP2 ☆11 Updated 2 years ago
- ☆69 Updated 4 years ago
- ☆40 Updated this week
- Program analysis tool based on software performance counters ☆56 Updated 3 years ago