microsoft / varuna

☆252

Alternatives and similar repositories for varuna

Users who are interested in varuna are comparing it to the libraries listed below.

meta-pytorch / torchsnapshot
A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind…
☆161 Updated last month
spcl / substation
Research and development for optimizing transformers
☆131 Updated 4 years ago
lucidrains / triton-transformer
Implementation of a Transformer, but completely in Triton
☆276 Updated 3 years ago
pytorch / torchdistx
Torch Distributed Experimental
☆117 Updated last year
foundation-model-stack / foundation-model-stack
🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components.
☆215 Updated last week
facebookresearch / HolisticTraceAnalysis
A library to analyze PyTorch traces.
☆419 Updated 2 weeks ago
stanford-futuredata / stk
☆112 Updated last year
hpcaitech / TensorNVMe
A Python library that transfers PyTorch tensors between CPU and NVMe
☆120 Updated 11 months ago
anyscale / llm-continuous-batching-benchmarks
☆121 Updated last year
awslabs / slapo
A schedule language for large model training
☆151 Updated 2 months ago
RulinShao / LightSeq
Official repository for DistFlashAttn: Distributed Memory-efficient Attention for Long-context LLMs Training
☆216 Updated last year
pytorch / tensorpipe
A tensor-aware point-to-point communication primitive for machine learning
☆274 Updated 2 months ago
meta-pytorch / torchx
TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup…
☆395 Updated 2 weeks ago
meta-pytorch / float8_experimental
This repository contains the experimental PyTorch native float8 training UX
☆223 Updated last year
zhuohan123 / terapipe
☆75 Updated 4 years ago
parasj / checkmate
Training neural networks in TensorFlow 2.0 with 5x less memory
☆136 Updated 3 years ago
DS3Lab / DT-FM
☆93 Updated 3 years ago
shawntan / scattermoe
Triton-based implementation of Sparse Mixture of Experts.
☆246 Updated 3 weeks ago
pytorch / PiPPy
Pipeline Parallelism for PyTorch
☆780 Updated last year
cli99 / llm-analysis
Latency and Memory Analysis of Transformer Models for Training and Inference
☆460 Updated 6 months ago
pytorch / rfcs
PyTorch RFCs (experimental)
☆135 Updated 5 months ago
cchan / tccl
Extensible collectives library in Triton
☆90 Updated 7 months ago
saareliad / FTPipe
FTPipe and related pipeline model parallelism research.
☆43 Updated 2 years ago
google / aqt
☆335 Updated last month
meta-pytorch / applied-ai
Applied AI experiments and examples for PyTorch
☆301 Updated 2 months ago
NVIDIA / nvidia-resiliency-ext
NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the …
☆228 Updated last week
xuqifan897 / Optimus
☆28 Updated 4 years ago
NVIDIA / LDDL
Distributed preprocessing and data loading for language datasets
☆39 Updated last year
DachengLi1 / AMP
(NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.
☆41 Updated 2 years ago
yandex-research / DeDLOC
Official code for "Distributed Deep Learning in Open Collaborations" (NeurIPS 2021)
☆117 Updated 3 years ago