pytorch / test-infra
This repository hosts code that supports the testing infrastructure for the PyTorch organization. For example, this repo hosts the logic to track disabled tests and slow tests, as well as our continuous integration jobs HUD/dashboard.
☆104Updated this week
Alternatives and similar repositories for test-infra
Users who are interested in test-infra are comparing it to the libraries listed below.
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind…☆162Updated 2 weeks ago
- TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup…☆411Updated this week
- PyTorch RFCs (experimental)☆136Updated 7 months ago
- A library to analyze PyTorch traces.☆453Updated 3 weeks ago
- Home for OctoML PyTorch Profiler☆114Updated 2 years ago
- A stand-alone implementation of several NumPy dtype extensions used in machine learning.☆323Updated this week
- Provide Python access to the NVML library for GPU diagnostics☆257Updated 4 months ago
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")☆368Updated this week
- A tensor-aware point-to-point communication primitive for machine learning☆282Updated 3 weeks ago
- TorchFix - a linter for PyTorch-using code with autofix support☆152Updated 4 months ago
- NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the …☆243Updated 2 weeks ago
- ☆149Updated this week
- Torch Distributed Experimental☆117Updated last year
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components.☆218Updated 3 weeks ago
- ☆187Updated last year
- The Triton backend for the PyTorch TorchScript models.☆168Updated 2 weeks ago
- TORCH_LOGS parser for PT2☆70Updated last week
- torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i…☆182Updated 3 weeks ago
- ☆341Updated this week
- Distributed preprocessing and data loading for language datasets☆40Updated last year
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo)☆465Updated 2 weeks ago
- ☆252Updated last year
- This repository contains the experimental PyTorch native float8 training UX☆227Updated last year
- A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.☆706Updated this week
- A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.☆913Updated this week
- Implementation of a Transformer, but completely in Triton☆278Updated 3 years ago
- extensible collectives library in triton☆91Updated 9 months ago
- MLPerf™ logging library☆38Updated 3 weeks ago
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel…☆398Updated 6 months ago
- Continuous builder and binary build scripts for pytorch☆356Updated 4 months ago