The Triton backend for TensorRT.
☆86 · Updated Mar 10, 2026
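For orientation: a model served through this backend is declared in its Triton model configuration with `platform: "tensorrt_plan"`, and the serialized engine is placed in a numbered version directory. A minimal sketch of such a config (the model name, tensor names, and shapes below are hypothetical, not taken from this repository):

```protobuf
# model_repository/my_trt_model/config.pbtxt  (hypothetical model)
# The engine file itself would live at model_repository/my_trt_model/1/model.plan
name: "my_trt_model"
platform: "tensorrt_plan"
max_batch_size: 8
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```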
Alternatives and similar repositories for tensorrt_backend
Users interested in tensorrt_backend are comparing it to the libraries listed below.
- Common source, scripts, and utilities for creating Triton backends. (☆369, updated Mar 10, 2026)
- Triton Python, C++, and Java client libraries, plus gRPC-generated client examples for Go, Java, and Scala. (☆686, updated Mar 10, 2026)
- The Triton backend for the ONNX Runtime. (☆172, updated this week)
- Triton backend that enables pre-processing, post-processing, and other logic to be implemented in Python. (☆672, updated this week)
- OpenVINO backend for Triton. (☆37, updated this week)
- The Triton TensorRT-LLM backend. (☆926, updated this week)
- The core library and APIs implementing the Triton Inference Server. (☆170, updated this week)
- Rust crate for submitting inference requests to machine learning models. (☆15, updated May 24, 2024)
- The Triton backend for PyTorch TorchScript models. (☆174, updated Mar 16, 2026)
- A DeepStream sample application demonstrating end-to-end video analytics for brick-and-mortar retail. (☆54, updated Oct 13, 2022)
- The Triton Inference Server provides an optimized cloud and edge inferencing solution. (☆10,446, updated this week)
- Provides an ensemble model to deploy a YOLOv8 ONNX model to Triton. (☆41, updated Oct 19, 2023)
- This repository contains tutorials and examples for Triton Inference Server. (☆823, updated Mar 10, 2026)
- Unofficial Go package for the Triton Inference Server (https://github.com/triton-inference-server/server). (☆50, updated this week)
- Triton Model Navigator is an inference toolkit for optimizing and deploying deep learning models, with a focus on NVIDIA GPUs. (☆220, updated Feb 3, 2026)
- Rust bindings to the Triton Inference Server. (☆19, updated Mar 14, 2024)
- Reproduces the DFT method (https://arxiv.org/abs/2508.05629) without using Verl. (☆21, updated Oct 14, 2025)
- Uses Meta's open-source LLM Llama 2 for local CPU inference on document question answering. (☆15, updated Oct 5, 2023)
- A project demonstrating how to build DeepStream Docker images. (☆93, updated Oct 1, 2025)
- Python wrapper class for OpenVINO Model Server; users can submit inference requests to OVMS with just a few lines of code. (☆10, updated Jan 16, 2022)
- Code and data for the paper "DTSM: Toward Dense Table Structure Recognition with Text Query Encoder and Adjacent Feature Aggregator". (☆12, updated Apr 28, 2024)
- Triton Model Analyzer is a CLI tool that helps you understand the compute and memory requirements of Triton Inference Server models. (☆507, updated this week)
- A simple tool that can quickly generate TensorRT plugin code. (☆240, updated Jul 11, 2023)
- (SIGIR 2025) Repository for "Review-driven Personalized Preference Reasoning with Large Language Models for Recommendation". (☆10, updated Jan 18, 2025)
- Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch); includes a converter from PyTorch -> O… (☆33, updated Aug 18, 2021)
- NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components. (☆12,800, updated Mar 9, 2026)
- Pushing the Limits of Zero-shot End-to-End Speech Translation. (☆26, updated Dec 12, 2024)
- PaddleOCR Lite license plate detection on a bare Raspberry Pi 4. (☆10, updated Apr 16, 2024)
- Custom gst-nvinfer for alignment in DeepStream. (☆31, updated Nov 22, 2024)
- Code for the paper "On the Importance of Feature Decorrelation for Unsupervised Representation Learning for RL" (ICML 2023). (☆12, updated Jun 13, 2023)
- A unified library of SOTA model optimization techniques such as quantization, pruning, distillation, and speculative decoding. It compresses… (☆2,218, updated this week)
- YOLOv9 TensorRT C++ implementation. (☆43, updated Nov 15, 2024)
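Several of the entries above are client libraries for Triton's KServe-v2 predict protocol. As a rough sketch of what those clients put on the wire, the plain-JSON body of an HTTP infer request can be built with the standard library alone (the model and tensor names here are hypothetical, and real clients also support binary tensor payloads and gRPC):

```python
# Sketch of a KServe-v2 inference request body, as sent by Triton's HTTP
# clients to POST /v2/models/<model>/infer. The tensor name "input" and
# the shape are hypothetical placeholders for a real model's signature.
import json

def build_infer_request(input_name, data, shape, datatype="FP32"):
    """Serialize one input tensor into the v2 JSON request format."""
    return json.dumps({
        "inputs": [
            {
                "name": input_name,
                "shape": shape,
                "datatype": datatype,
                "data": data,  # flattened, row-major values
            }
        ]
    })

body = build_infer_request("input", [0.0, 1.0, 2.0, 3.0], [1, 4])
```

The listed client libraries wrap this (plus response parsing, batching, and shared-memory transports) behind typed APIs, so hand-building JSON like this is mainly useful for debugging with `curl`.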