ryantd / veloce

WIP. Veloce is a low-code Ray-based parallelization library that makes machine learning computation novel, efficient, and heterogeneous.

☆17

Alternatives and similar repositories for veloce

Users who are interested in veloce are comparing it to the libraries listed below.

zhisbug / ray-scalable-ml-design
Some microbenchmarks and design docs before commencement
☆12 Updated 4 years ago
ray-project / ray_shuffling_data_loader
A Ray-based data loader with per-epoch shuffling and configurable pipelining, for shuffling and loading training data for distributed tra…
☆18 Updated 2 years ago
tensorchord / inference-benchmark
Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper)
☆28 Updated 2 years ago
ray-project / distml
Distributed ML Optimizer
☆34 Updated 4 years ago
rapidsai / ucx-py
Python bindings for UCX
☆140 Updated 2 months ago
Michaelvll / llm-ie-benchmarks
A collection of reproducible inference engine benchmarks
☆38 Updated 7 months ago
facebookresearch / fairring
Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large …
☆65 Updated 3 years ago
ucbrise / hypersched
Deadline-based hyperparameter tuning on RayTune.
☆31 Updated 5 years ago
simon-mo / vLLM-Benchmark
☆31 Updated 7 months ago
octoml / octoml-profile
Home for OctoML PyTorch Profiler
☆114 Updated 2 years ago
danyangz / lightning
Lightning In-Memory Object Store
☆47 Updated 3 years ago
octoml / synr
A library for syntactically rewriting Python programs, pronounced (sinner).
☆68 Updated 3 years ago
hpcaitech / CachedEmbedding
A memory efficient DLRM training solution using ColossalAI
☆106 Updated 3 years ago
smartnets / dataloader-benchmarks
DL Dataloader Benchmarks
☆20 Updated 10 months ago
HabanaAI / Megatron-DeepSpeed
Intel Gaudi's Megatron DeepSpeed Large Language Models for training
☆15 Updated 11 months ago
meta-pytorch / torchsnapshot
A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind…
☆161 Updated 2 months ago
mlcommons / logging
MLPerf™ logging library
☆37 Updated last month
deepspeedai / DeepSpeed-Kernels
☆71 Updated 8 months ago
NVIDIA / LDDL
Distributed preprocessing and data loading for language datasets
☆39 Updated last year
intel / llm-on-ray
Pretrain, finetune and serve LLMs on Intel platforms with Ray
☆130 Updated 2 months ago
hpcaitech / TensorNVMe
A Python library that transfers PyTorch tensors between CPU and NVMe
☆122 Updated last year
triton-inference-server / fil_backend
FIL backend for the Triton Inference Server
☆83 Updated this week
rapidsai / wholegraph
WholeGraph - large scale Graph Neural Networks
☆105 Updated last year
ray-project / ray_beam_runner
Ray-based Apache Beam runner
☆42 Updated 2 years ago
chips-compilers-mlsys-21 / chips-compilers-mlsys-21.github.io
☆11 Updated 4 years ago
anyscale / llm-continuous-batching-benchmarks
☆122 Updated last year
nums-project / nums
A library that translates Python and NumPy to optimized distributed systems code.
☆131 Updated 3 years ago
Qihoo360 / dgl-operator
The DGL Operator makes it easy to run Deep Graph Library (DGL) graph neural network training on Kubernetes
☆44 Updated 4 years ago
UmerHA / triton_util
Make triton easier
☆49 Updated last year
petuum / autodist
Simple Distributed Deep Learning on TensorFlow
☆134 Updated 5 months ago