pytorch / test-infra
This repository hosts code that supports the testing infrastructure for the PyTorch organization. For example, this repo hosts the logic to track disabled tests and slow tests, as well as our continuous integration jobs HUD/dashboard.
☆104Updated this week
Alternatives and similar repositories for test-infra
Users who are interested in test-infra are comparing it to the repositories listed below.
- PyTorch RFCs (experimental)☆138Updated 8 months ago
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind…☆164Updated 2 weeks ago
- TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup…☆411Updated this week
- A library to analyze PyTorch traces.☆460Updated last week
- Home for OctoML PyTorch Profiler☆113Updated 2 years ago
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")☆375Updated this week
- Torch Distributed Experimental☆117Updated last year
- A stand-alone implementation of several NumPy dtype extensions used in machine learning.☆327Updated 3 weeks ago
- NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the …☆253Updated last week
- A tensor-aware point-to-point communication primitive for machine learning☆283Updated last month
- ☆344Updated 3 weeks ago
- The Triton backend for the PyTorch TorchScript models.☆171Updated 2 weeks ago
- ☆151Updated 3 weeks ago
- TORCH_TRACE parser for PT2☆72Updated last week
- ☆186Updated last year
- ☆252Updated last year
- Provide Python access to the NVML library for GPU diagnostics☆258Updated 4 months ago
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo)☆472Updated 2 weeks ago
- TorchFix - a linter for PyTorch-using code with autofix support☆152Updated 5 months ago
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components.☆218Updated last week
- oneCCL Bindings for Pytorch* (deprecated)☆104Updated 3 weeks ago
- Implementation of a Transformer, but completely in Triton☆279Updated 3 years ago
- MLPerf™ logging library☆38Updated last month
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)☆205Updated this week
- An experimental implementation of compiler-driven automatic sharding of models across a given device mesh.☆51Updated last week
- torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i…☆182Updated last month
- Distributed preprocessing and data loading for language datasets☆40Updated last year
- extensible collectives library in triton☆93Updated 9 months ago
- Tokamax: A GPU and TPU kernel library.☆165Updated this week
- A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.☆919Updated this week