The Triton backend for TensorRT.
☆88May 8, 2026Updated 2 weeks ago
Alternatives and similar repositories for tensorrt_backend
Users that are interested in tensorrt_backend are comparing it to the libraries listed below.
- Common source, scripts and utilities for creating Triton backends.☆372Updated this week
- Triton Python, C++ and Java client libraries, and GRPC-generated client examples for Go, Java and Scala.☆690Updated this week
- The Triton backend for the ONNX Runtime.☆176Updated this week
- TRITONCACHE implementation of a Redis cache☆17May 8, 2026Updated 2 weeks ago
- Triton backend that enables pre-processing, post-processing and other logic to be implemented in Python.☆677Updated this week
- OpenVINO backend for Triton.☆37May 8, 2026Updated 2 weeks ago
- ☆341May 8, 2026Updated 2 weeks ago
- The Triton TensorRT-LLM Backend☆934May 7, 2026Updated 2 weeks ago
- The Triton backend for the PyTorch TorchScript models.☆178May 15, 2026Updated last week
- ☆27Nov 6, 2024Updated last year
- Common source, scripts and utilities shared across all Triton repositories.☆79May 8, 2026Updated 2 weeks ago
- A DeepStream sample application demonstrating end-to-end retail video analytics for brick-and-mortar retail.☆56Oct 13, 2022Updated 3 years ago
- Provides an ensemble model to deploy a YoloV8 ONNX model to Triton☆42Oct 19, 2023Updated 2 years ago
- The Triton Inference Server provides an optimized cloud and edge inferencing solution.☆10,687Updated this week
- This repository contains tutorials and examples for Triton Inference Server☆838May 8, 2026Updated 2 weeks ago
- Custom payload for sending nvdsanalytics messages to Kafka☆23Nov 16, 2022Updated 3 years ago
- Using OpenVINO to speed up MeloTTS inference☆15Nov 1, 2024Updated last year
- Unofficial Go package for the Triton Inference Server (https://github.com/triton-inference-server/server)☆50May 15, 2026Updated last week
- Rust bindings to the Triton Inference Server☆19Mar 14, 2024Updated 2 years ago
- Using open-source LLM Llama2 by Meta on local CPU inference for document question-and-answer☆15Oct 5, 2023Updated 2 years ago
- A project demonstrating how to make DeepStream docker images.☆92Apr 20, 2026Updated last month
- Reproduced the DFT method without using Verl. https://arxiv.org/abs/2508.05629☆23Oct 14, 2025Updated 7 months ago
- The vLLM XPU kernels for Intel GPU☆44Updated this week
- ☆11Oct 11, 2023Updated 2 years ago
- Python wrapper class for OpenVINO Model Server. Users can submit inference requests to OVMS with just a few lines of code.☆10Jan 16, 2022Updated 4 years ago
- Code and data for the paper: DTSM: Toward Dense Table Structure Recognition with Text Query Encoder and Adjacent Feature Aggregator☆12Apr 28, 2024Updated 2 years ago
- ☆25Oct 10, 2022Updated 3 years ago
- Triton Model Analyzer is a CLI tool to help with better understanding of the compute and memory requirements of the Triton Inference Serv…☆509May 13, 2026Updated last week
- A simple tool that can generate TensorRT plugin code quickly.☆241Jul 11, 2023Updated 2 years ago
- yolov5-deepsort+opencv.kcf+TensorRT+QT☆29Jan 20, 2022Updated 4 years ago
- NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source compone…☆13,000Apr 13, 2026Updated last month
- ☆412Nov 11, 2023Updated 2 years ago
- Pushing the Limits of Zero-shot End-to-End Speech Translation☆25Dec 12, 2024Updated last year
- Nvidia HairWorks OpenGL implementation☆12Apr 30, 2016Updated 10 years ago
- PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.☆844Aug 13, 2025Updated 9 months ago
- Official implementation of the ICLR 2024 paper AffineQuant☆30Mar 30, 2024Updated 2 years ago
- A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresse…☆2,750Updated this week
- SPRINT: Script-agnostic Structure Recognition in Tables☆16Mar 26, 2025Updated last year
- Agile assessment exercise ideas☆15Apr 14, 2025Updated last year