Convert nvprof profiles into about:tracing-compatible JSON files
☆73 · Apr 9, 2021 · Updated 4 years ago
Alternatives and similar repositories for nvprof2json
Users who are interested in nvprof2json are comparing it to the repositories listed below.
- A Python script to convert the output of NVIDIA Nsight Systems (in SQLite format) to JSON in Google Chrome Trace Event Format. ☆55 · Aug 5, 2025 · Updated 7 months ago
- Prototype routines for GPU quantization written using PyTorch. ☆21 · Feb 8, 2026 · Updated 3 weeks ago
- Python tools for NVIDIA Profiler ☆21 · Dec 21, 2017 · Updated 8 years ago
- ☆21 · Mar 3, 2025 · Updated last year
- Torch FFI-bindings for NNPACK ☆31 · May 26, 2017 · Updated 8 years ago
- Strassen's Algorithm for Tensor Contraction ☆14 · Jul 7, 2017 · Updated 8 years ago
- A tracing JIT compiler for PyTorch ☆13 · Dec 11, 2021 · Updated 4 years ago
- Hacks for PyTorch ☆19 · Apr 18, 2023 · Updated 2 years ago
- ☆20 · Nov 23, 2022 · Updated 3 years ago
- TORCH_TRACE parser for PT2 ☆78 · Updated this week
- Mixed precision training from scratch with Tensors and CUDA ☆28 · May 14, 2024 · Updated last year
- A GPU cache model for research purposes ☆28 · Nov 4, 2013 · Updated 12 years ago
- [DEPRECATED] Moved to ROCm/rocm-systems repo ☆154 · Jan 21, 2026 · Updated last month
- Hugging Face Download (Cache) Manager ☆22 · Aug 7, 2022 · Updated 3 years ago
- Playground for some RNN stuff in Torch. ☆21 · Aug 12, 2015 · Updated 10 years ago
- A Toy-Purpose TPU Simulator ☆22 · Jun 7, 2024 · Updated last year
- ☆28 · Jan 17, 2025 · Updated last year
- FlexAttention w/ FlashAttention3 Support ☆27 · Oct 5, 2024 · Updated last year
- Implementation of IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs (ICLR 2024). ☆25 · Feb 22, 2026 · Updated last week
- ☆27 · Oct 26, 2019 · Updated 6 years ago
- Malmo Collaborative AI Challenge - Team Pig Catcher ☆66 · May 22, 2017 · Updated 8 years ago
- NCCL Fast Socket is a transport layer plugin to improve NCCL collective communication performance on Google Cloud. ☆122 · Nov 15, 2023 · Updated 2 years ago
- This is a python library intended for management and storage of Frequency Response Functions and Complex Frequency Domain Assurance Crite… ☆10 · Nov 16, 2023 · Updated 2 years ago
- Repository for go shared libraries (for now). ☆11 · Dec 1, 2025 · Updated 3 months ago
- Code for reproducing work of ICML 2019 paper: Memory-Optimal Direct Convolutions for Maximizing Classification Accuracy in Embedded Appli… ☆12 · Jun 8, 2019 · Updated 6 years ago
- A complete pipeline for fine-tuning YOLOv8 pose models with custom datasets. Supports automatic and semi-automatic annotation for efficie… ☆15 · Feb 9, 2025 · Updated last year
- Personal summaries of deep learning and AI papers ☆31 · Jan 10, 2021 · Updated 5 years ago
- A TUI-based utility for real-time monitoring of InfiniBand traffic and performance metrics on the local node ☆63 · Dec 19, 2025 · Updated 2 months ago
- PyTorch RFCs (experimental) ☆139 · May 26, 2025 · Updated 9 months ago
- MaskedTensors for PyTorch ☆39 · Jul 17, 2022 · Updated 3 years ago
- The Radeon Compute Profiler (RCP) is a performance analysis tool that gathers data from the API run-time and GPU for OpenCL™ and ROCm/HSA… ☆85 · Jun 16, 2020 · Updated 5 years ago
- This repository contains the results and code for the MLPerf™ Training v1.0 benchmark. ☆36 · Feb 23, 2024 · Updated 2 years ago
- Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry ☆42 · Jan 15, 2024 · Updated 2 years ago
- I heard you like compile times ☆42 · Feb 8, 2020 · Updated 6 years ago
- ☆12 · Oct 19, 2014 · Updated 11 years ago
- Slimebound character mod for Slay the Spire ☆14 · Jun 30, 2020 · Updated 5 years ago
- ☆10 · Oct 27, 2023 · Updated 2 years ago
- Intel(R) Distribution for GDB* ☆15 · Jan 26, 2026 · Updated last month
- Extended globbing in modern C++ ☆12 · Dec 24, 2025 · Updated 2 months ago