triton-inference-server / redis_cache
TRITONCACHE implementation of a Redis cache
☆14 Updated 2 weeks ago
Alternatives and similar repositories for redis_cache
Users who are interested in redis_cache are comparing it to the libraries listed below.
- First token cutoff sampling inference example ☆30 Updated last year
- A collection of reproducible inference engine benchmarks ☆31 Updated 2 months ago
- Creating Generative AI Apps which work ☆17 Updated 2 months ago
- Triton backend for managing the model state tensors automatically in sequence batcher ☆17 Updated last year
- xet client tech, used in huggingface_hub ☆118 Updated last week
- Some microbenchmarks and design docs before commencement ☆12 Updated 4 years ago
- 🚀 Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models. ☆11 Updated last week
- MLFlow Deployment Plugin for Ray Serve ☆45 Updated 3 years ago
- ☆15 Updated 2 months ago
- A Ray-based data loader with per-epoch shuffling and configurable pipelining, for shuffling and loading training data for distributed tra… ☆18 Updated 2 years ago
- WIP. Veloce is a low-code Ray-based parallelization library that makes machine learning computation novel, efficient, and heterogeneous. ☆18 Updated 2 years ago
- PostText is a QA system for querying your text data. When appropriate structured views are in place, PostText is good at answering querie… ☆32 Updated 2 years ago
- ☆13 Updated 2 years ago
- ☆13 Updated last year
- Core Utilities for NVIDIA Merlin ☆19 Updated 11 months ago
- Code for paper: "Privately generating tabular data using language models". ☆15 Updated 2 years ago
- ☆39 Updated 2 years ago
- ☆18 Updated 3 weeks ago
- Fast and versatile tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and… ☆26 Updated 3 months ago
- ☆37 Updated this week
- Generate Glue Code in seconds to simplify your Nvidia Triton Inference Server Deployments ☆20 Updated 11 months ago
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper) ☆28 Updated 2 years ago
- a pipeline for using api calls to agnostically convert unstructured data into structured training data ☆30 Updated 9 months ago
- code for paper "Accessing higher dimensions for unsupervised word translation" ☆21 Updated 2 years ago
- Visualize multi-model embedding spaces. The first goal is to quickly get a lay of the land of any embedding space. Then be able to scroll… ☆27 Updated last year
- No-GIL Python environment featuring NVIDIA Deep Learning libraries. ☆61 Updated 2 months ago
- Tools for merging pretrained large language models. ☆19 Updated last year
- 👷 Build compute kernels ☆68 Updated this week
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆100 Updated last week
- ☆49 Updated this week