NVIDIA / multi-storage-client
Unified high-performance Python client for object and file stores.
☆57 · Updated 2 weeks ago
Alternatives and similar repositories for multi-storage-client
Users who are interested in multi-storage-client are comparing it to the libraries listed below.
- A tool to configure, launch and manage your machine learning experiments. ☆216 · Updated this week
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ☆474 · Updated 3 weeks ago
- Scalable and Performant Data Loading ☆364 · Updated this week
- Load compute kernels from the Hub ☆389 · Updated last week
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆279 · Updated 2 months ago
- Speed up model training by fixing data loading. ☆575 · Updated this week
- TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup… ☆412 · Updated this week
- Megatron's multi-modal data loader ☆315 · Updated last week
- Where GPUs get cooked 👩‍🍳🔥 ☆362 · Updated 2 weeks ago
- PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support ☆266 · Updated this week
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆219 · Updated this week
- This repository contains the experimental PyTorch native float8 training UX ☆227 · Updated last year
- A library to analyze PyTorch traces. ☆462 · Updated this week
- Pipeline Parallelism for PyTorch ☆784 · Updated last year
- PyTorch Single Controller ☆957 · Updated this week
- A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate. ☆739 · Updated this week
- Container plugin for Slurm Workload Manager ☆412 · Updated 3 weeks ago
- The Amazon S3 Connector for PyTorch delivers high throughput for PyTorch training jobs that access and store data in Amazon S3. ☆203 · Updated this week
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆164 · Updated 3 weeks ago
- NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the … ☆255 · Updated this week
- JAX-Toolbox ☆382 · Updated this week
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ☆404 · Updated last month
- Module, Model, and Tensor Serialization/Deserialization ☆286 · Updated 5 months ago
- 👷 Build compute kernels ☆214 · Updated last week
- TorchFix - a linter for PyTorch-using code with autofix support ☆152 · Updated 5 months ago
- A Quirky Assortment of CuTe Kernels ☆781 · Updated this week
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ☆79 · Updated last month
- KvikIO - High Performance File IO ☆238 · Updated last week
- The tool facilitates debugging convergence issues and testing new algorithms and recipes for training LLMs using Nvidia libraries such as… ☆18 · Updated 4 months ago
- (no description) ☆345 · Updated last week