project-codeflare / zero-copy-model-loading
In-depth code associated with my Medium blog post, "How to Load PyTorch Models 340 Times Faster with Ray"
☆28 · Updated 2 years ago
Alternatives and similar repositories for zero-copy-model-loading
Users who are interested in zero-copy-model-loading are comparing it to the libraries listed below.
- Simple dependency injection framework for Python ☆21 · Updated last year
- High-performance safetensors model loader ☆40 · Updated this week
- Pygloo provides Python bindings for Gloo. ☆21 · Updated 4 months ago
- WIP. Veloce is a low-code Ray-based parallelization library that makes machine learning computation novel, efficient, and heterogeneous. ☆18 · Updated 2 years ago
- Sentence Embedding as a Service ☆15 · Updated last year
- Distributed ML Optimizer ☆32 · Updated 3 years ago
- A collection of reproducible inference engine benchmarks ☆31 · Updated 2 months ago
- Productionize machine learning predictions, with ONNX or without ☆65 · Updated last year
- Module, Model, and Tensor Serialization/Deserialization ☆241 · Updated 2 weeks ago
- Some microbenchmarks and design docs before commencement ☆12 · Updated 4 years ago
- MLFlow Deployment Plugin for Ray Serve ☆45 · Updated 3 years ago
- 🐍 Python bindings for the Hora Approximate Nearest Neighbor Search Algorithm library ☆72 · Updated 3 years ago
- TorchFix - a linter for PyTorch-using code with autofix support ☆143 · Updated 4 months ago
- Notes and artifacts from the ONNX steering committee ☆26 · Updated this week
- A place to store reusable transformer components of my own creation or found on the interwebs ☆56 · Updated last week
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆157 · Updated this week
- ☆13 · Updated 2 years ago
- FIL backend for the Triton Inference Server ☆81 · Updated 3 weeks ago
- Benchmarking some transformer deployments ☆26 · Updated 2 years ago
- A Ray-based data loader with per-epoch shuffling and configurable pipelining, for shuffling and loading training data for distributed tra… ☆18 · Updated 2 years ago
- Home for OctoML PyTorch Profiler ☆113 · Updated 2 years ago
- DL Dataloader Benchmarks ☆18 · Updated 5 months ago
- ☆13 · Updated last year
- ☆37 · Updated this week
- Experiments with inference on llama ☆104 · Updated last year
- Python bindings for UCX ☆137 · Updated this week
- An Aspiring Drop-In Replacement for Pandas at Scale ☆73 · Updated 3 years ago
- Lightning In-Memory Object Store ☆46 · Updated 3 years ago
- No-GIL Python environment featuring NVIDIA Deep Learning libraries. ☆61 · Updated 2 months ago
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper) ☆28 · Updated 2 years ago