triton-inference-server / openvino_backend
OpenVINO backend for Triton.
☆30 · Updated last week
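To use this backend, a model in the Triton model repository selects it via the `backend` field of its `config.pbtxt`. A minimal sketch is shown below; the model name, tensor names, and shapes are hypothetical placeholders, not taken from this repository.

```
# models/my_openvino_model/config.pbtxt (hypothetical model name)
name: "my_openvino_model"
backend: "openvino"    # selects the OpenVINO backend
max_batch_size: 0      # dims below describe the full tensor shape, batch dim included
input [
  {
    name: "INPUT0"             # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 1, 3, 224, 224 ]
  }
]
output [
  {
    name: "OUTPUT0"            # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 1, 1000 ]
  }
]
```

The OpenVINO model file itself (e.g. `model.xml`/`model.bin`) sits in a numbered version subdirectory next to this config, following Triton's standard model-repository layout.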
Alternatives and similar repositories for openvino_backend:
Users interested in openvino_backend are comparing it to the libraries listed below.
- The Triton backend for the ONNX Runtime. ☆136 · Updated this week
- The Triton backend for PyTorch TorchScript models. ☆141 · Updated last week
- The Triton backend for TensorRT. ☆68 · Updated 2 weeks ago
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying deep learning models, with a focus on NVIDIA GPUs. ☆193 · Updated 2 weeks ago
- Common source, scripts, and utilities for creating Triton backends. ☆306 · Updated 2 weeks ago
- ☆37 · Updated this week
- The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's Python API. ☆132 · Updated 2 weeks ago
- ☆32 · Updated 11 months ago
- Common source, scripts, and utilities shared across all Triton repositories. ☆67 · Updated 2 weeks ago
- Nsight Compute in Docker ☆11 · Updated last year
- Model compression for ONNX ☆81 · Updated 2 months ago
- Triton CLI is an open-source command line interface that enables users to create, deploy, and profile models served by the Triton Inference Server. ☆52 · Updated this week
- ☆54 · Updated last year
- Triton Model Analyzer is a CLI tool that helps users understand the compute and memory requirements of Triton Inference Server models. ☆448 · Updated 2 weeks ago
- The Triton backend for TensorFlow. ☆47 · Updated last week
- ☆30 · Updated 2 years ago
- ☆218 · Updated this week
- An inference framework for the llama model, implemented in CUDA C++. ☆44 · Updated 2 months ago
- MLPerf™ logging library ☆32 · Updated 3 weeks ago
- Transformer-related optimizations, including BERT and GPT ☆17 · Updated last year
- ☆69 · Updated last year
- Common utilities for ONNX converters ☆257 · Updated last month
- The core library and APIs implementing the Triton Inference Server. ☆114 · Updated this week
- ☆18 · Updated last week
- Notes and artifacts from the ONNX steering committee ☆25 · Updated last week
- oneCCL Bindings for PyTorch* ☆87 · Updated 3 weeks ago
- Standalone Flash Attention v2 kernel without libtorch dependency ☆99 · Updated 4 months ago
- Large Language Model Text Generation Inference on Habana Gaudi ☆31 · Updated this week
- ☆58 · Updated 8 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆50 · Updated this week