msaroufim / awesome-profiling
Awesome utilities for performance profiling
☆185Updated 4 months ago
Alternatives and similar repositories for awesome-profiling
Users that are interested in awesome-profiling are comparing it to the libraries listed below
- Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system like the…☆326Updated this week
- Lightweight daemon for monitoring CUDA runtime API calls with eBPF uprobes☆117Updated 4 months ago
- An experimental CPU backend for Triton (https://github.com/openai/triton)☆43Updated 4 months ago
- Machine Learning Framework for Operating Systems - Brings ML to Linux kernel☆249Updated 3 years ago
- CUDA checkpoint and restore utility☆353Updated 6 months ago
- AI/GPU flame graph☆178Updated last week
- Home for OctoML PyTorch Profiler☆113Updated 2 years ago
- MLIR-based partitioning system☆114Updated this week
- TORCH_LOGS parser for PT2☆47Updated last week
- A library to analyze PyTorch traces.☆400Updated this week
- High-performance safetensors model loader☆52Updated 2 weeks ago
- Awesome resources for GPUs☆577Updated 2 years ago
- GPUOcelot: A dynamic compilation framework for PTX☆204Updated 5 months ago
- TritonParse: A Compiler Tracer, Visualizer, and mini-Reproducer (WIP) for Triton Kernels☆138Updated this week
- MLPerf™ logging library☆37Updated this week
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")☆345Updated this week
- Parallel Computing starter project to build GPU & CPU kernels in CUDA & C++ and call them from Python without a single line of CMake usin…☆26Updated 4 months ago
- ☆38Updated this week
- Memory Optimizations for Deep Learning (ICML 2023)☆102Updated last year
- An I/O benchmark for deep learning applications☆89Updated last month
- Benchmarks to capture important workloads.☆31Updated 6 months ago
- A stand-alone implementation of several NumPy dtype extensions used in machine learning.☆282Updated 3 weeks ago
- PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for…☆147Updated last week
- Open source cross-platform compiler for compute-intensive loops used in AI algorithms, from Microsoft Research☆109Updated last year
- PyTorch RFCs (experimental)☆133Updated 2 months ago
- DCPerf benchmark suite for hyperscale cloud applications☆196Updated this week
- Python bindings for UCX☆137Updated last week
- A plugin for Jupyter Notebook to run CUDA C/C++ code☆238Updated 10 months ago
- A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.☆200Updated this week
- LLM training in simple, raw C/CUDA☆102Updated last year