NVIDIA / multi-storage-client
Unified high-performance Python client for object and file stores.
☆28Updated last week
Alternatives and similar repositories for multi-storage-client
Users who are interested in multi-storage-client are comparing it to the libraries listed below.
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel…☆335Updated this week
- AIStore: scalable storage for AI applications☆1,519Updated this week
- Container plugin for Slurm Workload Manager☆344Updated 7 months ago
- Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimenta…☆499Updated 2 weeks ago
- NVIDIA Inference Xfer Library (NIXL)☆387Updated this week
- ☆49Updated 3 months ago
- Tools to deploy GPU clusters in the Cloud☆31Updated 2 years ago
- NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the …☆173Updated this week
- ☆138Updated 2 weeks ago
- PyTorch per step fault tolerance (actively under development)☆302Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs☆75Updated this week
- KvikIO - High Performance File IO☆210Updated this week
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference☆60Updated 2 months ago
- TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup…☆366Updated 2 weeks ago
- Module, Model, and Tensor Serialization/Deserialization☆234Updated last week
- A tool to configure, launch and manage your machine learning experiments.☆153Updated this week
- A collection of YAML files, Helm Charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deploymen…☆182Updated last week
- ☆215Updated last week
- Experimental projects related to TensorRT☆105Updated last week
- This is a plugin that lets EC2 developers use libfabric as a network provider while running NCCL applications.☆176Updated this week
- This repository contains the results and code for the MLPerf™ Training v3.1 benchmark.☆17Updated 4 months ago
- ☆62Updated 3 months ago
- JAX-Toolbox☆308Updated this week
- Scalable data preprocessing and curation toolkit for LLMs☆933Updated this week
- Provides for deploying custom ETL containers on AIStore, with subsequent user-defined extraction-transformation-loading in parallel, on t…☆16Updated this week
- Pipeline Parallelism for PyTorch☆767Updated 9 months ago
- xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool to help Cloud developers to orchestrate training jobs on accelerat…☆123Updated this week
- Scalable and Performant Data Loading☆269Updated last week
- A tensor-aware point-to-point communication primitive for machine learning☆257Updated 2 years ago
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)☆186Updated this week