NVIDIA / LDDL

Distributed preprocessing and data loading for language datasets

☆39

Alternatives and similar repositories for LDDL

Users who are interested in LDDL are comparing it to the libraries listed below.

meta-pytorch / torchsnapshot
A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind…
☆161 Updated 2 months ago
pytorch / torchdistx
Torch Distributed Experimental
☆117 Updated last year
hpcaitech / TensorNVMe
A Python library that transfers PyTorch tensors between CPU and NVMe
☆122 Updated last year
deepspeedai / DeepSpeed-Kernels
☆71 Updated 8 months ago
anyscale / llm-continuous-batching-benchmarks
☆122 Updated last year
HabanaAI / Model-References
Reference models for Intel(R) Gaudi(R) AI Accelerator
☆169 Updated 2 months ago
triton-inference-server / pytorch_backend
The Triton backend for the PyTorch TorchScript models.
☆166 Updated last week
stanford-futuredata / stk
☆113 Updated last year
spcl / substation
Research and development for optimizing transformers
☆131 Updated 4 years ago
foundation-model-stack / foundation-model-stack
🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components.
☆217 Updated last week
intel / torch-ccl
oneCCL Bindings for Pytorch* (deprecated)
☆103 Updated last month
pytorch / rfcs
PyTorch RFCs (experimental)
☆136 Updated 6 months ago
microsoft / varuna
☆252 Updated last year
facebookresearch / fairring
Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large …
☆65 Updated 3 years ago
mgmalek / efficient_cross_entropy
☆121 Updated last year
mlcommons / training_results_v1.0
This repository contains the results and code for the MLPerf™ Training v1.0 benchmark.
☆36 Updated last year
cchan / tccl
extensible collectives library in triton
☆91 Updated 8 months ago
lucidrains / triton-transformer
Implementation of a Transformer, but completely in Triton
☆277 Updated 3 years ago
exists-forall / striped_attention
☆44 Updated 2 years ago
meta-pytorch / float8_experimental
This repository contains the experimental PyTorch native float8 training UX
☆226 Updated last year
IST-DASLab / Sparse-Marlin
Boosting 4-bit inference kernels with 2:4 Sparsity
☆86 Updated last year
intel / intel-extension-for-deepspeed
Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note…
☆63 Updated 5 months ago
meta-pytorch / applied-ai
Applied AI experiments and examples for PyTorch
☆308 Updated 3 months ago
octoml / octoml-profile
Home for OctoML PyTorch Profiler
☆114 Updated 2 years ago
pytorch / tensorpipe
A tensor-aware point-to-point communication primitive for machine learning
☆275 Updated last month
NVIDIA / nvidia-resiliency-ext
NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the …
☆239 Updated this week
RulinShao / LightSeq
Official repository for DistFlashAttn: Distributed Memory-efficient Attention for Long-context LLMs Training
☆218 Updated last year
graphcore / tutorials
Training material for IPU users: tutorials, feature examples, simple applications
☆87 Updated 2 years ago
tgale96 / grouped_gemm
PyTorch bindings for CUTLASS grouped GEMM.
☆132 Updated 6 months ago
mlcommons / training_results_v0.7
This repository contains the results and code for the MLPerf™ Training v0.7 benchmark.
☆57 Updated 2 years ago