project-codeflare / zero-copy-model-loading
In-depth code associated with my Medium blog post, "How to Load PyTorch Models 340 Times Faster with Ray"
☆26 Updated 2 years ago
Alternatives and similar repositories for zero-copy-model-loading:
Users who are interested in zero-copy-model-loading are comparing it to the libraries listed below.
- Simple dependency injection framework for Python ☆20 Updated 10 months ago
- MLFlow Deployment Plugin for Ray Serve ☆44 Updated 2 years ago
- Pygloo provides Python bindings for Gloo. ☆21 Updated 3 weeks ago
- Plugin for deploying MLflow models to TorchServe ☆108 Updated last year
- Distributed ML Optimizer ☆30 Updated 3 years ago
- A lightweight wrapper for PyTorch that provides a simple declarative API for context switching between devices, distributed modes, mixed-… ☆67 Updated last year
- Provide Python access to the NVML library for GPU diagnostics ☆226 Updated 3 months ago
- Module, Model, and Tensor Serialization/Deserialization ☆217 Updated last month
- Productionize machine learning predictions, with ONNX or without ☆65 Updated last year
- The Triton backend for the PyTorch TorchScript models. ☆144 Updated last week
- Torch Distributed Experimental ☆115 Updated 7 months ago
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆155 Updated 3 months ago
- A top-like tool for monitoring GPUs in a cluster ☆86 Updated last year
- Sentence Embedding as a Service ☆15 Updated last year
- benchmarking some transformer deployments ☆26 Updated last year
- A modular system for machinable research code ☆35 Updated 2 months ago
- Home for OctoML PyTorch Profiler ☆108 Updated last year
- Cortex-compatible model server for Python and TensorFlow ☆17 Updated 2 years ago
- Ray - A curated list of resources: https://github.com/ray-project/ray ☆52 Updated last month
- The Triton backend for the ONNX Runtime. ☆140 Updated last week
- A client library in Rust for Nvidia Triton. ☆28 Updated last year
- NLP with Rust for Python 🦀🐍 ☆61 Updated 9 months ago
- Utilities for Training Very Large Models ☆58 Updated 5 months ago
- Serialize JAX, Flax, Haiku, or Objax model params with 🤗`safetensors` ☆44 Updated 9 months ago
- MLCube® is a project that reduces friction for machine learning by ensuring that models are easily portable and reproducible. ☆154 Updated 6 months ago
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆45 Updated 8 months ago
- A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend. ☆109 Updated 3 weeks ago