project-codeflare / zero-copy-model-loading
In-depth code associated with my Medium blog post, "How to Load PyTorch Models 340 Times Faster with Ray"
☆ 24
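The blog post's core idea is to avoid per-process copies of model weights by serving them from shared memory and reconstructing tensors as views. A minimal sketch of that principle, using only NumPy and Python's stdlib `multiprocessing.shared_memory` rather than Ray's object store (which the repo actually uses); all names here are illustrative:

```python
import numpy as np
from multiprocessing import shared_memory

# Simulated model weights; in the blog post these come from a PyTorch
# state_dict converted to NumPy arrays.
weights = np.arange(1_000_000, dtype=np.float32)

# Loader process: copy the weights into shared memory exactly once.
shm = shared_memory.SharedMemory(create=True, size=weights.nbytes)
np.ndarray(weights.shape, dtype=weights.dtype, buffer=shm.buf)[:] = weights

# Inference process: attach to the same segment and build a zero-copy view.
reader = shared_memory.SharedMemory(name=shm.name)
view = np.ndarray(weights.shape, dtype=weights.dtype, buffer=reader.buf)

assert view.base is not None  # a view over shared memory, not an owning copy
assert np.array_equal(view[:5], [0, 1, 2, 3, 4])

checksum = float(view[:5].sum())  # read before detaching
del view                          # release the buffer before closing the segment
reader.close()
shm.close()
shm.unlink()
```

With Ray, the same effect falls out of `ray.put()` on NumPy arrays: workers that call `ray.get()` receive read-only views backed by the shared object store, and tensors can be rebuilt over them without copying.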
Related projects:
- Simple dependency injection framework for Python (☆ 21)
- Productionize machine learning predictions, with ONNX or without (☆ 66)
- Module, Model, and Tensor Serialization/Deserialization (☆ 175)
- PyTorch-centric eager mode debugger (☆ 43)
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8 (☆ 34)
- Utilities for Training Very Large Models (☆ 56)
- A lightweight wrapper for PyTorch that provides a simple declarative API for context switching between devices, distributed modes, mixed-… (☆ 66)
- Home for OctoML PyTorch Profiler (☆ 105)
- Torch Distributed Experimental (☆ 115)
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… (☆ 145)
- The Triton backend for PyTorch TorchScript models (☆ 117)
- Hyperparameter management (☆ 43)
- Benchmarking some transformer deployments (☆ 26)
- MLflow Deployment Plugin for Ray Serve (☆ 41)
- Pygloo provides Python bindings for Gloo (☆ 16)
- Notes and artifacts from the ONNX steering committee (☆ 24)
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components (☆ 144)
- torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i… (☆ 173)
- Context manager to profile the forward and backward times of PyTorch's nn.Module (☆ 83)
- WIP: Veloce is a low-code Ray-based parallelization library that makes machine learning computation novel, efficient, and heterogeneous (☆ 18)
- Plugin for deploying MLflow models to TorchServe (☆ 106)
- A top-like tool for monitoring GPUs in a cluster (☆ 80)
- Distributed ML Optimizer (☆ 31)
- API serving for your diffusers models (☆ 10)
- A user-friendly toolchain that enables seamless execution of ONNX models using JAX as the backend (☆ 94)
- Simple and fast low-bit matmul kernels in CUDA (☆ 48)
- A client library in Rust for NVIDIA Triton (☆ 23)
- PyTorch half-precision GEMM library with fused optional bias and optional ReLU/GELU (☆ 25)