iarai / concurrent-dataloaderLinks

Profiling and Improving the PyTorch Dataloader for high-latency Storage

☆20

Alternatives and similar repositories for concurrent-dataloader

Users that are interested in concurrent-dataloader are comparing it to the libraries listed below

Sorting:

Harry-Chen / InfMoE
Inference framework for MoE layers based on TensorRT with Python binding
☆41Updated 4 years ago
spcl / substation
Research and development for optimizing transformers
☆129Updated 4 years ago
hpcaitech / TensorNVMe
A Python library transfers PyTorch tensors between CPU and NVMe
☆117Updated 7 months ago
pytorch / torchsnapshot
A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind…
☆158Updated last month
guanh01 / CS692-mlsys
This is the (evolving) reading list for the seminar.
☆59Updated 4 years ago
ray-project / distml
Distributed ML Optimizer
☆32Updated 3 years ago
facebookresearch / fairring
Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large …
☆65Updated 3 years ago
NVIDIA / LDDL
Distributed preprocessing and data loading for language datasets
☆39Updated last year
eedalong / Dpex
Distributed DataLoader For Pytorch Based On Ray
☆24Updated 3 years ago
hpcaitech / CachedEmbedding
A memory efficient DLRM training solution using ColossalAI
☆105Updated 2 years ago
petuum / autodist
Simple Distributed Deep Learning on TensorFlow
☆133Updated last month
cli99 / flops-profiler
pytorch-profiler
☆51Updated 2 years ago
parasj / checkmate
Training neural networks in TensorFlow 2.0 with 5x less memory
☆132Updated 3 years ago
cake-lab / perseus
☆10Updated 2 years ago
apd10 / universal_memory_allocation
☆15Updated 3 years ago
smartnets / dataloader-benchmarks
DL Dataloader Benchmarks
☆19Updated 5 months ago
tensorchord / inference-benchmark
Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper)
☆28Updated 2 years ago
mlcommons / training_results_v1.0
This repository contains the results and code for the MLPerf™ Training v1.0 benchmark.
☆38Updated last year
stanford-futuredata / stk
☆107Updated 11 months ago
saareliad / FTPipe
FTPipe and related pipeline model parallelism research.
☆41Updated 2 years ago
qhliu26 / Dive-into-Big-Model-Training
📑 Dive into Big Model Training
☆115Updated 2 years ago
ModelTC / awesome-lm-system
Summary of system papers/frameworks/codes/tools on training or serving large model
☆57Updated last year
pytorch / multipy
torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i…
☆180Updated 2 weeks ago
Aleph-Alpha-Research / NeurIPS-WANT-submission-efficient-parallelization-layouts
☆22Updated last year
facebookresearch / MODel_opt
Memory Optimizations for Deep Learning (ICML 2023)
☆102Updated last year
msr-fiddle / DS-Analyzer
☆38Updated 4 years ago
volcengine / veGiantModel
☆220Updated last year
uwsampl / dtr-prototype
Dynamic Tensor Rematerialization prototype (modified PyTorch) and simulator. Paper: https://arxiv.org/abs/2006.09616
☆132Updated 2 years ago
YuhanLiu11 / AutoFreeze
☆22Updated 4 years ago
feifeibear / PSTensor
PSTensor provides a way to hack the memory management of tensors in TensorFlow and PyTorch by defining your own C++ Tensor Class.
☆10Updated 3 years ago