GPUprobe / gpuprobe-daemon
Lightweight daemon for monitoring CUDA runtime API calls with eBPF uprobes
☆84 Updated 2 weeks ago
Alternatives and similar repositories for gpuprobe-daemon:
Users who are interested in gpuprobe-daemon are comparing it to the repositories listed below
- Hooked CUDA-related dynamic libraries by using automated code generation tools. ☆150 Updated last year
- An efficient GPU resource sharing system with fine-grained control for Linux platforms. ☆82 Updated last year
- Fast OS-level support for GPU checkpoint and restore ☆181 Updated this week
- cricket is a virtualization solution for GPUs ☆191 Updated last month
- CUDA checkpoint and restore utility ☆322 Updated 2 months ago
- NCCL Profiling Kit ☆129 Updated 9 months ago
- Artifacts for our NSDI'23 paper TGS ☆75 Updated 10 months ago
- DCPerf benchmark suite for hyperscale cloud applications ☆162 Updated last week
- NVIDIA NCCL Tests for Distributed Training ☆88 Updated last week
- The NVIDIA GPU driver container allows the provisioning of the NVIDIA driver through the use of containers. ☆106 Updated this week
- KV cache store for distributed LLM inference ☆136 Updated 2 weeks ago
- qCUDA: GPGPU Virtualization at a New API Remoting Method with Para-virtualization ☆121 Updated 3 years ago
- A tool to detect infrastructure issues on cloud native AI systems ☆30 Updated 3 weeks ago
- ☆47 Updated 7 months ago
- NVIDIA Inference Xfer Library (NIXL) ☆255 Updated this week
- Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system like the… ☆310 Updated last week
- Magnum IO community repo ☆89 Updated 2 months ago
- The criu-coordinator tool aims to enable checkpoint/restore support for distributed applications with CRIU.