triton-inference-server / stateful_backend
Triton backend that automatically manages model state tensors in the sequence batcher
☆13Updated 9 months ago
Related projects
Alternatives and complementary repositories for stateful_backend
- The Triton backend for the ONNX Runtime.☆133Updated this week
- ☆52Updated last year
- Cortex-compatible model server for Python and TensorFlow☆17Updated last year
- TRITONCACHE implementation of a Redis cache☆12Updated this week
- Parallelized automatic corpus collection for ASR. Forked from https://github.com/EgorLakomkin/KTSpeechCrawler☆23Updated 3 years ago
- Modeled on faiss4j; now deprecated in favor of RPC communication with the C version☆12Updated 2 years ago
- Faiss server for efficient similarity search and clustering of dense vectors☆23Updated 2 years ago
- Sentence Embedding as a Service☆14Updated last year
- Faiss bindings for Java☆22Updated 4 years ago
- Common source, scripts and utilities shared across all Triton repositories.☆62Updated this week
- benchmarking some transformer deployments☆26Updated last year
- Coqui Inference Engine☆38Updated 3 years ago
- Read-only unofficial mirror of OpenFst☆43Updated 2 years ago
- Efficient and effective query auto-completion in C++.☆51Updated last year
- 🌏 Modular retrievers for zero-shot multilingual IR.☆27Updated 8 months ago
- Tutorial on how to convert machine learned models into ONNX☆15Updated last year
- Distributed ML Optimizer☆30Updated 3 years ago
- Triton CLI is an open source command line interface that enables users to create, deploy, and profile models served by the Triton Inferen…☆51Updated this week
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs.☆185Updated 2 months ago
- ☆86Updated 2 years ago
- Distributed Approximate Nearest Neighbors Database https://anndb.com☆35Updated 3 years ago
- A C++ library providing fast language model queries in compressed space.☆128Updated last year
- MozoLM: A language model (LM) serving library☆44Updated 3 weeks ago
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages☆13Updated 2 years ago
- ☆30Updated 2 years ago
- ☆30Updated 2 years ago
- Conversational AI Benchmark.☆65Updated last year
- gRPC server over a FAISS index☆17Updated 3 years ago
- ☆15Updated last year
- Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 models for speech recogni…☆24Updated 3 years ago