Triton CLI is an open source command line interface that enables users to create, deploy, and profile models served by the Triton Inference Server.
☆74Jun 8, 2026Updated last week
Alternatives and similar repositories for triton_cli
Users that are interested in triton_cli are comparing it to the libraries listed below.
- ☆13May 8, 2023Updated 3 years ago
- Integrating SSE with NVIDIA Triton Inference Server using a Python backend and Zephyr model. There is very little documentation on how to use …☆10May 29, 2024Updated 2 years ago
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs.☆222May 27, 2026Updated 2 weeks ago
- Triton backend that enables pre-process, post-processing and other logic to be implemented in Python.☆676Updated this week
- This repository contains tutorials and examples for Triton Inference Server☆840Updated this week
- OpenAI compatible API for TensorRT LLM triton backend☆221Aug 1, 2024Updated last year
- Triton Model Analyzer is a CLI tool to help with better understanding of the compute and memory requirements of the Triton Inference Serv…☆513Updated this week
- Fuses IMU readings with a complementary filter to achieve accurate pitch and roll readings.☆15Aug 23, 2021Updated 4 years ago
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment.☆161Updated this week
- PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.☆844Aug 13, 2025Updated 10 months ago
- The Triton TensorRT-LLM Backend☆935Updated this week
- A CUDA kernel optimization toolkit for validation, benchmarking, Nsight Compute profiling, bottleneck analysis, and iterative tuning. It …☆177Apr 22, 2026Updated last month
- ☆22Updated this week
- Quotek is an open source algotrading platform, written in C++.☆10Nov 12, 2020Updated 5 years ago
- Multiple GEMM operators are constructed with cutlass to support LLM inference.☆20Aug 3, 2025Updated 10 months ago
- Multi-layer perceptron, Autoencoder, and Restricted Boltzmann Machine☆10Sep 15, 2018Updated 7 years ago
- The core library and APIs implementing the Triton Inference Server.☆173Updated this week
- Fork of FakeSMC, PlugIns, and HwMonitor (based on slice's branch)☆14Mar 19, 2013Updated 13 years ago
- An NVIDIA Triton Server workflow for OCR and the LayoutLMv3 Transformer Model☆30Sep 14, 2022Updated 3 years ago
- ☆14Updated this week
- Compare multiple optimization methods on Triton to improve model service performance☆52Jan 10, 2024Updated 2 years ago
- Repository for open inference protocol specification☆72May 12, 2025Updated last year
- Implementation of SmoothCache, a project aimed at speeding-up Diffusion Transformer (DiT) based GenAI models with error-guided caching.☆49Jul 17, 2025Updated 10 months ago
- react-native-web + native-base-web starter kit (Boilerplate)☆12Dec 30, 2016Updated 9 years ago
- Get GDDR5 memory information and other information from AMD Radeon GPUs.☆13May 26, 2018Updated 8 years ago
- A collection of YAML files, Helm Charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deploymen…☆236May 15, 2026Updated last month
- Ansible Playbooks/Roles to Clone OCP 4.x Repos and Build Supporting Infrastructure Needed for UPI☆10Aug 25, 2021Updated 4 years ago
- A brief understanding of ffmpeg cli through pseudocode☆11Dec 20, 2020Updated 5 years ago
- ☆21May 30, 2024Updated 2 years ago
- Run cloud native workloads on NVIDIA GPUs☆238Jan 22, 2026Updated 4 months ago
- The Triton backend for TensorFlow.☆56Nov 22, 2025Updated 6 months ago
- ☆66Apr 26, 2025Updated last year
- ☆345Updated this week
- ☆14Jun 10, 2023Updated 3 years ago
- Common source, scripts and utilities for creating Triton backends.☆373Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs☆121Updated this week
- ☆13Dec 3, 2021Updated 4 years ago
- ☆33Feb 3, 2025Updated last year
- Linux kernel SGX driver for Graphene☆12Nov 3, 2020Updated 5 years ago