facebookresearch / param

PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end-to-end nets for evaluation of training and inference platforms.

☆153

Alternatives and similar repositories for param

Users that are interested in param are comparing it to the libraries listed below

Azure / msccl
Microsoft Collective Communication Library
☆66 Updated last year
microsoft / NPKit
NCCL Profiling Kit
☆149 Updated last year
google / nccl-fastsocket
NCCL Fast Socket is a transport layer plugin to improve NCCL collective communication performance on Google Cloud.
☆122 Updated 2 years ago
Mellanox / nccl-rdma-sharp-plugins
RDMA and SHARP plugins for nccl library
☆215 Updated 2 weeks ago
microsoft / msccl-tools
Synthesizer for optimal collective communication algorithms
☆121 Updated last year
aws / aws-ofi-nccl
This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications.
☆198 Updated last week
microsoft / msccl
Microsoft Collective Communication Library
☆375 Updated 2 years ago
parasailteam / coconet
☆83 Updated 3 years ago
mcrl / tccl
Thunder Research Group's Collective Communication Library
☆43 Updated 4 months ago
SymbioticLab / Salus
Fine-grained GPU sharing primitives
☆147 Updated 4 months ago
microsoft / mscclpp
MSCCL++: A GPU-driven communication stack for scalable AI applications
☆439 Updated this week
mlcommons / chakra
Repository for MLCommons Chakra schema and tools
☆142 Updated last month
NVIDIA / nvidia-resiliency-ext
NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the …
☆239 Updated this week
alpa-projects / mms
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)
☆91 Updated 2 years ago
microsoft / taccl
TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches
☆78 Updated 2 years ago
facebookresearch / torch_ucc
PyTorch process group third-party plugin for UCC
☆21 Updated last year
mlcommons / chakra-old
Repository for MLCommons Chakra schema and tools
☆39 Updated last year
openucx / ucc
Unified Collective Communication Library
☆280 Updated this week
awslabs / slapo
A schedule language for large model training
☆151 Updated 3 months ago
saareliad / FTPipe
FTPipe and related pipeline model parallelism research.
☆43 Updated 2 years ago
eniac / paella
Paella: Low-latency Model Serving with Virtualized GPU Scheduling
☆65 Updated last year
ParCoreLab / Snoopie
Multi-GPU communication profiler and visualizer
☆36 Updated last year
NVIDIA / cloudai
CloudAI Benchmark Framework
☆75 Updated this week
uxlfoundation / oneCCL
oneAPI Collective Communications Library (oneCCL)
☆248 Updated this week
eth-easl / orion
An interference-aware scheduler for fine-grained GPU sharing
☆153 Updated last week
microsoft / SuperScaler
An experimental parallel training platform
☆56 Updated last year
geoffxy / habitat
🔮 Execution time predictions for deep neural network training iterations across different GPUs.
☆63 Updated 3 years ago
ROCm / rccl
ROCm Communication Collectives Library (RCCL)
☆403 Updated last week
SymbioticLab / Oobleck
A resilient distributed training framework
☆96 Updated last year
NVIDIA / MagnumIO
Magnum IO community repo
☆104 Updated 3 months ago