The core library and APIs implementing the Triton Inference Server.
☆169 · Updated Feb 28, 2026
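Since this repository exposes Triton's in-process C API (tritonserver.h), a minimal sketch of bringing the server library up against a model repository might look like the following. The `/models` path is a placeholder assumption, and real code would go on to build `TRITONSERVER_InferenceRequest` objects against the loaded models.

```c
/* Minimal sketch: start the Triton server library via its in-process C API.
 * Assumes the headers and library built from this repo are installed;
 * the "/models" repository path is a placeholder. */
#include "tritonserver.h"
#include <stdio.h>
#include <stdlib.h>

static void check(TRITONSERVER_Error* err) {
  if (err != NULL) {
    fprintf(stderr, "triton error: %s\n", TRITONSERVER_ErrorMessage(err));
    TRITONSERVER_ErrorDelete(err);
    exit(1);
  }
}

int main(void) {
  TRITONSERVER_ServerOptions* options = NULL;
  check(TRITONSERVER_ServerOptionsNew(&options));
  check(TRITONSERVER_ServerOptionsSetModelRepositoryPath(options, "/models"));

  TRITONSERVER_Server* server = NULL;
  check(TRITONSERVER_ServerNew(&server, options));  /* loads the models */
  check(TRITONSERVER_ServerOptionsDelete(options));

  /* ... create TRITONSERVER_InferenceRequest objects and run inference ... */

  check(TRITONSERVER_ServerStop(server));
  check(TRITONSERVER_ServerDelete(server));
  return 0;
}
```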
Alternatives and similar repositories for core
Users that are interested in core are comparing it to the libraries listed below
- Common source, scripts and utilities for creating Triton backends. ☆369 · Updated Feb 9, 2026
- The Triton backend for the ONNX Runtime. ☆173 · Updated Feb 25, 2026
- Common source, scripts and utilities shared across all Triton repositories. ☆79 · Updated this week
- Triton Python, C++ and Java client libraries, and gRPC-generated client examples for Go, Java and Scala. ☆682 · Updated Feb 24, 2026
- Triton Model Analyzer is a CLI tool that helps users understand the compute and memory requirements of the Triton Inference Serv… ☆506 · Updated Feb 17, 2026
- The Triton backend for TensorRT. ☆87 · Updated Feb 9, 2026
- Triton backend that enables pre-processing, post-processing and other logic to be implemented in Python. ☆673 · Updated this week
- Triton backend for managing model state tensors automatically in the sequence batcher. ☆17 · Updated Feb 12, 2024
- The Triton Inference Server provides an optimized cloud and edge inferencing solution. ☆10,393 · Updated this week
- Rust crate for some audio utilities. ☆27 · Updated Mar 8, 2025
- The Triton backend for TensorFlow. ☆56 · Updated Nov 22, 2025
- ☆331 · Updated Feb 9, 2026
- TRITONCACHE implementation of a Redis cache. ☆16 · Updated Feb 9, 2026
- OneFlow Serving. ☆21 · Updated Apr 10, 2025
- The Triton backend for the PyTorch TorchScript models. ☆173 · Updated this week
- This repository contains tutorials and examples for Triton Inference Server. ☆825 · Updated Feb 9, 2026
- ☆24 · Updated Jun 8, 2025
- HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of… ☆194 · Updated this week
- Triton CLI is an open source command line interface that enables users to create, deploy, and profile models served by the Triton Inferen… ☆73 · Updated this week
- Docker base images for C++ development using vcpkg. ☆10 · Updated Jan 27, 2026
- ☆11 · Updated Oct 11, 2023
- 🚀 Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models. ☆13 · Updated Jan 30, 2026
- ☆413 · Updated Nov 11, 2023
- ☆22 · Updated Feb 9, 2026
- High-level API for tar-based dataset. ☆12 · Updated Feb 3, 2024
- Nsight Compute In Docker. ☆13 · Updated Dec 21, 2023
- ☆12 · Updated Apr 5, 2019
- FastAPI middleware for comparing different ML model serving approaches. ☆15 · Updated Jul 5, 2023
- Random collections of code examples. ☆12 · Updated Mar 19, 2025
- The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's Python API. ☆141 · Updated this week
- CUDA Core Compute Libraries. ☆2,182 · Updated this week
- Disaggregated serving system for Large Language Models (LLMs). ☆777 · Updated Apr 6, 2025
- Blazing fast data loading with HuggingFace Dataset and Ray Data. ☆16 · Updated Jan 12, 2024
- NVIDIA Inference Xfer Library (NIXL). ☆898 · Updated this week
- Transformer-related optimization, including BERT and GPT. ☆6,398 · Updated Mar 27, 2024
- This repository contains statistics about AI Infrastructure products. ☆17 · Updated Feb 27, 2025
- Tensor library. ☆17 · Updated Jul 19, 2024
- Proof of concept for running moshi/hibiki using WebRTC. ☆20 · Updated Feb 28, 2025
- Almost-Pure Rust TTS Engine for my Rustnation talk. ☆50 · Updated Jan 6, 2025