triton-inference-server / redis_cache
TRITONCACHE implementation of a Redis cache
☆14 Updated 3 weeks ago
Alternatives and similar repositories for redis_cache
Users who are interested in redis_cache are comparing it to the libraries listed below.
- Module, Model, and Tensor Serialization/Deserialization ☆260 Updated 2 weeks ago
- Triton backend for managing the model state tensors automatically in sequence batcher ☆18 Updated last year
- Unified storage framework for the entire machine learning lifecycle ☆156 Updated last year
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆108 Updated this week
- xet client tech, used in huggingface_hub ☆183 Updated last week
- Home for OctoML PyTorch Profiler ☆114 Updated 2 years ago
- ☆38 Updated last week
- MLFlow Deployment Plugin for Ray Serve ☆46 Updated 3 years ago
- The Triton backend for the PyTorch TorchScript models. ☆159 Updated 3 weeks ago
- WIP. Veloce is a low-code Ray-based parallelization library that makes machine learning computation novel, efficient, and heterogeneous. ☆18 Updated 3 years ago
- TorchFix - a linter for PyTorch-using code with autofix support ☆145 Updated last week
- ☆31 Updated 4 months ago
- FIL backend for the Triton Inference Server ☆82 Updated 3 weeks ago
- Some microbenchmarks and design docs before commencement ☆12 Updated 4 years ago
- TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup… ☆387 Updated this week
- The Triton backend for the ONNX Runtime. ☆159 Updated this week
- A collection of reproducible inference engine benchmarks ☆32 Updated 4 months ago
- First token cutoff sampling inference example ☆31 Updated last year
- A minimal shared memory object store design ☆53 Updated 8 years ago
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper) ☆28 Updated 2 years ago
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ☆392 Updated this week
- A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend. ☆122 Updated last month
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆61 Updated 3 months ago
- Ray-based Apache Beam runner ☆41 Updated 2 years ago
- ☆15 Updated 2 weeks ago
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆159 Updated 2 months ago
- ☆53 Updated this week
- ☆14 Updated 3 years ago
- ☆12 Updated last year
- A Ray-based data loader with per-epoch shuffling and configurable pipelining, for shuffling and loading training data for distributed tra… ☆18 Updated 2 years ago