triton-inference-server / redis_cache
TRITONCACHE implementation of a Redis cache
☆16 Updated last week
Alternatives and similar repositories for redis_cache
Users who are interested in redis_cache are comparing it to the libraries listed below.
- Module, Model, and Tensor Serialization/Deserialization ☆285 Updated 5 months ago
- Triton backend for managing the model state tensors automatically in sequence batcher ☆17 Updated last year
- TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup… ☆411 Updated this week
- ☆43 Updated last week
- xet client tech, used in huggingface_hub ☆393 Updated this week
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆121 Updated this week
- Unified storage framework for the entire machine learning lifecycle ☆155 Updated last year
- The Triton backend for the ONNX Runtime. ☆171 Updated last week
- MLFlow Deployment Plugin for Ray Serve ☆46 Updated 3 years ago
- The Triton backend for the PyTorch TorchScript models. ☆171 Updated last week
- FIL backend for the Triton Inference Server ☆87 Updated last week
- Home for OctoML PyTorch Profiler ☆114 Updated 2 years ago
- WIP. Veloce is a low-code Ray-based parallelization library that makes machine learning computation novel, efficient, and heterogeneous. ☆17 Updated 3 years ago
- Repository for open inference protocol specification ☆63 Updated 8 months ago
- ☆31 Updated 9 months ago
- A Ray-based data loader with per-epoch shuffling and configurable pipelining, for shuffling and loading training data for distributed tra… ☆18 Updated 3 years ago
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆63 Updated 4 months ago
- TorchFix - a linter for PyTorch-using code with autofix support ☆152 Updated 5 months ago
- ☆60 Updated this week
- Python bindings for UCX ☆140 Updated 4 months ago
- Fast and vectorizable algorithms for searching in a vector of sorted floating point numbers ☆153 Updated last year
- Distributed XGBoost on Ray ☆152 Updated last year
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs. ☆215 Updated 9 months ago
- ☆72 Updated this week
- cuVS - a library for vector search and clustering on the GPU ☆615 Updated this week
- Tune efficiently any LLM model from HuggingFace using distributed training (multiple GPU) and DeepSpeed. Uses Ray AIR to orchestrate the … ☆60 Updated 2 years ago
- ☆59 Updated 2 years ago
- ☆151 Updated 2 weeks ago
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper) ☆28 Updated 2 years ago
- A Cloud-Native WAL Storage Implementation ☆73 Updated this week