pytorch / test-infra
This repository hosts code that supports the testing infrastructure for the PyTorch organization. For example, this repo hosts the logic to track disabled tests and slow tests, as well as the HUD/dashboard for our continuous integration jobs.
☆100Updated this week
Alternatives and similar repositories for test-infra
Users who are interested in test-infra compare it to the libraries listed below.
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind…☆161Updated 2 months ago
- PyTorch RFCs (experimental)☆135Updated 3 months ago
- TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup…☆387Updated last week
- A library to analyze PyTorch traces.☆406Updated 3 weeks ago
- A stand-alone implementation of several NumPy dtype extensions used in machine learning.☆296Updated this week
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")☆351Updated last week
- A tensor-aware point-to-point communication primitive for machine learning☆265Updated 3 weeks ago
- TorchFix - a linter for PyTorch-using code with autofix support☆147Updated 3 weeks ago
- ☆176Updated last year
- Provide Python access to the NVML library for GPU diagnostics☆245Updated last week
- Home for OctoML PyTorch Profiler☆114Updated 2 years ago
- Torch Distributed Experimental☆117Updated last year
- ☆251Updated last year
- torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i…☆180Updated 2 weeks ago
- A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.☆289Updated last week
- A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.☆861Updated last week
- ☆330Updated this week
- NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the …☆216Updated this week
- ☆146Updated last month
- jax-triton contains integrations between JAX and OpenAI Triton☆416Updated last week
- TORCH_LOGS parser for PT2☆59Updated this week
- Distributed preprocessing and data loading for language datasets☆39Updated last year
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo)☆395Updated 2 weeks ago
- Pipeline Parallelism for PyTorch☆779Updated last year
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)☆194Updated this week
- PyTorch interface for the IPU☆181Updated last year
- Implementation of a Transformer, but completely in Triton☆274Updated 3 years ago
- PyTorch centric eager mode debugger☆48Updated 8 months ago
- This repository contains the experimental PyTorch native float8 training UX☆224Updated last year
- oneCCL Bindings for Pytorch*☆102Updated last month