pytorch / test-infra
This repository hosts code that supports the testing infrastructure for the PyTorch organization. For example, this repo hosts the logic to track disabled tests and slow tests, as well as our continuous integration jobs HUD/dashboard.
☆96 Updated this week
Alternatives and similar repositories for test-infra
Users who are interested in test-infra are comparing it to the libraries listed below.
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆158 Updated 3 weeks ago
- PyTorch RFCs (experimental) ☆133 Updated last month
- A library to analyze PyTorch traces. ☆391 Updated this week
- ☆142 Updated this week
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ☆280 Updated this week
- TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup… ☆370 Updated last week
- Torch Distributed Experimental ☆116 Updated 11 months ago
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆343 Updated this week
- TORCH_LOGS parser for PT2 ☆46 Updated last week
- TorchFix - a linter for PyTorch-using code with autofix support ☆143 Updated 5 months ago
- ☆169 Updated last year
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆205 Updated this week
- ☆320 Updated 2 weeks ago
- Home for OctoML PyTorch Profiler ☆113 Updated 2 years ago
- Provide Python access to the NVML library for GPU diagnostics ☆241 Updated 7 months ago
- A tensor-aware point-to-point communication primitive for machine learning ☆259 Updated 2 years ago
- ☆186 Updated last month
- jax-triton contains integrations between JAX and OpenAI Triton ☆405 Updated 3 weeks ago
- A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters. ☆829 Updated last week
- NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the … ☆187 Updated this week
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ☆359 Updated 2 weeks ago
- ☆40 Updated 7 months ago
- Applied AI experiments and examples for PyTorch ☆281 Updated last month
- A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend. ☆115 Updated 2 weeks ago
- extensible collectives library in triton ☆87 Updated 3 months ago
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ☆190 Updated this week
- This repository contains the experimental PyTorch native float8 training UX ☆224 Updated 11 months ago
- oneCCL Bindings for Pytorch* ☆99 Updated this week
- ☆22 Updated this week
- Stores documents and resources used by the OpenXLA developer community ☆126 Updated 11 months ago