msaroufim / awesome-profiling
Awesome utilities for performance profiling
☆171Updated 3 months ago
Alternatives and similar repositories for awesome-profiling
Users who are interested in awesome-profiling are comparing it to the libraries listed below
- Awesome resources for GPUs☆572Updated last year
- Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system like the…☆314Updated last week
- MLIR-based partitioning system☆86Updated this week
- A library to analyze PyTorch traces.☆384Updated last week
- Open source cross-platform compiler for compute-intensive loops used in AI algorithms, from Microsoft Research☆109Updated last year
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind…☆157Updated 6 months ago
- A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.☆153Updated this week
- torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i…☆179Updated 5 months ago
- An experimental CPU backend for Triton (https://github.com/openai/triton)☆43Updated 2 months ago
- CUDA checkpoint and restore utility☆341Updated 4 months ago
- Training neural networks in TensorFlow 2.0 with 5x less memory☆131Updated 3 years ago
- Curated list of awesome material on optimization techniques to make artificial intelligence faster and more efficient 🚀☆116Updated last year
- Benchmarks to capture important workloads.☆31Updated 4 months ago
- Dias: Dynamic Rewriting of Pandas Code☆72Updated 3 weeks ago
- A stand-alone implementation of several NumPy dtype extensions used in machine learning.☆268Updated last week
- The missing pieces (as far as boilerplate reduction goes) of the upstream MLIR python bindings.☆99Updated last week
- An interactive web-based tool for exploring intermediate representations of PyTorch and Triton models☆46Updated this week
- A Data-Centric Compiler for Machine Learning☆83Updated last year
- End to End steps for adding custom ops in PyTorch.☆23Updated 4 years ago
- An IR for efficiently simulating distributed ML computation.☆28Updated last year
- Reference Kernels for the Leaderboard☆55Updated this week
- GPUOcelot: A dynamic compilation framework for PTX☆192Updated 3 months ago
- High-Performance SGEMM on CUDA devices☆94Updated 4 months ago
- AI/GPU flame graph☆150Updated last week
- ☆150Updated last week
- An open-source efficient deep learning framework/compiler, written in Python.☆700Updated this week
- ☆416Updated this week
- This repository hosts code that supports the testing infrastructure for the PyTorch organization. For example, this repo hosts the logic …☆92Updated this week
- Small scale distributed training of sequential deep learning models, built on Numpy and MPI.☆133Updated last year
- Home for OctoML PyTorch Profiler☆113Updated 2 years ago