pytorch/kineto

pytorch / kineto

A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.

☆974

Alternatives and similar repositories for kineto

Users who are interested in kineto are comparing it to the libraries listed below.

facebookresearch / HolisticTraceAnalysis
A library to analyze PyTorch traces.
☆535 · May 29, 2026 · Updated last month
pytorch / torchdynamo
A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
☆1,078 · Apr 17, 2024 · Updated 2 years ago
pytorch / benchmark
TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.
☆1,042 · Updated this week
pytorch / FBGEMM
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
☆1,570 · Updated this week
pytorch / torchdistx
Torch Distributed Experimental
☆117 · Aug 5, 2024 · Updated last year
facebookresearch / fairscale
PyTorch extensions for high performance and large scale training.
☆3,410 · Apr 26, 2025 · Updated last year
facebookincubator / dynolog
Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system like the…
☆375 · Updated this week
pytorch / PiPPy
Pipeline Parallelism for PyTorch
☆786 · Aug 21, 2024 · Updated last year
meta-pytorch / torchsnapshot
A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind…
☆165 · Jun 10, 2026 · Updated last month
pytorch / builder
Continuous builder and binary build scripts for pytorch
☆355 · Aug 12, 2025 · Updated 11 months ago
NVIDIA / TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on H…
☆3,434 · Updated this week
NVIDIA / nccl-tests
NCCL Tests
☆1,595 · Jul 9, 2026 · Updated last week
thuml / depyf
depyf is a tool to help you understand and adapt to PyTorch compiler torch.compile.
☆815 · Oct 13, 2025 · Updated 9 months ago
triton-lang / triton
Development repository for the Triton language and compiler
☆19,725 · Updated this week
microsoft / msccl
Microsoft Collective Communication Library
☆394 · Sep 20, 2023 · Updated 2 years ago
NVIDIA / cutlass
CUDA Templates and Python DSLs for High-Performance Linear Algebra
☆10,104 · Updated this week
pytorch / functorch
functorch is JAX-like composable function transforms for PyTorch.
☆1,434 · Aug 21, 2025 · Updated 10 months ago
facebookresearch / param
PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for…
☆155 · Jul 2, 2026 · Updated 2 weeks ago
meta-pytorch / torchx
TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup…
☆427 · Updated this week
NVIDIA / PyProf
A GPU performance profiling tool for PyTorch models
☆510 · Jul 13, 2021 · Updated 5 years ago
pytorch / rfcs
PyTorch RFCs (experimental)
☆147 · Updated this week
NVIDIA / Megatron-LM
Ongoing research training transformer models at scale
☆17,108 · Updated this week
NVIDIA / FasterTransformer
Transformer related optimization, including BERT, GPT
☆6,439 · Mar 27, 2024 · Updated 2 years ago
meta-pytorch / float8_experimental
This repository contains the experimental PyTorch native float8 training UX
☆226 · Aug 1, 2024 · Updated last year
pytorch / tensorpipe
A tensor-aware point-to-point communication primitive for machine learning
☆286 · Dec 17, 2025 · Updated 7 months ago
pytorch / ao
PyTorch native quantization and sparsity for training and inference
☆2,906 · Updated this week
meta-pytorch / applied-ai
Applied AI experiments and examples for PyTorch
☆322 · Aug 22, 2025 · Updated 10 months ago
facebookresearch / fairring
Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large …
☆66 · Mar 21, 2022 · Updated 4 years ago
NVIDIA / nccl
Optimized primitives for collective multi-GPU communication
☆4,892 · Updated this week
flashinfer-ai / flashinfer
FlashInfer: Kernel Library for LLM Serving
☆5,983 · Updated this week
alpa-projects / alpa
Training and serving large-scale neural networks with auto parallelization.
☆3,178 · Dec 9, 2023 · Updated 2 years ago
pytorch / torchtitan
A PyTorch native platform for training generative AI models
☆5,541 · Updated this week
volcengine / veScale
Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs
☆1,031 · Mar 3, 2026 · Updated 4 months ago
meta-pytorch / tritonbench
Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.
☆361 · Updated this week
microsoft / NPKit
NCCL Profiling Kit
☆155 · Jul 1, 2024 · Updated 2 years ago
pytorch / test-infra
This repository hosts code that supports the testing infrastructure for the PyTorch organization. For example, this repo hosts the logic …
☆110 · Updated this week
facebookresearch / xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
☆10,523 · Updated this week
pytorch / gloo
Collective communications library with various primitives for multi-machine training.
☆1,437 · Jul 1, 2026 · Updated 2 weeks ago
NVIDIA / apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
☆8,982 · Jul 13, 2026 · Updated last week