zzk0 / triton

Triton Inference Server Model Config and Client Scripts

☆32

Alternatives and similar repositories for triton

Users that are interested in triton are comparing it to the libraries listed below

bug-developer021 / YOLOV5_optimization_on_triton
Compare multiple optimization methods on triton to improve model service performance
☆52 Updated last year
dingyuqing05 / trt2022_wenet
☆72 Updated 2 years ago
torchpipe / torchpipe
Serving Inside Pytorch
☆163 Updated last month
Tlntin / ChatGLM2-6B-TensorRT
☆90 Updated 2 years ago
inisis / OnnxLLM
Large Language Model Onnx Inference Framework
☆36 Updated 9 months ago
BBuf / onnx_learn
☆100 Updated 4 years ago
PaddlePaddle / Paddle-Inference-Demo
☆266 Updated 5 months ago
triton-inference-server / paddlepaddle_backend
☆36 Updated last year
DeepVAC / libdeepvac
Use PyTorch model in C++ project
☆140 Updated 4 years ago
zhaohb / fastapi_tritonserver
☆27 Updated 11 months ago
Deep-Spark / DeepSparkInference
DeepSparkInference has selected 216 inference models of both small and large sizes. The small models cover fields such as computer vision…
☆25 Updated last week
wangzyon / pyInfer
async inference for machine learning model
☆26 Updated 3 years ago
isarsoft / yolov4-triton-tensorrt
This repository deploys YOLOv4 as an optimized TensorRT engine to Triton Inference Server
☆287 Updated 3 years ago
Bobo-y / triton_ensemble_model_demo
triton server ensemble model demo
☆30 Updated 3 years ago
layerism / TensorRT-Inference-Server-Tutorial
Server-side deep learning deployment examples
☆454 Updated 5 years ago
tsingmicro-toolchain / OnnxSlim
A Toolkit to Help Optimize Large Onnx Model
☆161 Updated last year
YellowOldOdd / SDBI
Simple Dynamic Batching Inference
☆145 Updated 3 years ago
Tlntin / trt2023
☆26 Updated 2 years ago
Oldpan / DeployIsAllYouNeed
☆120 Updated 2 years ago
Rayrtfr / FasterTransformer
Transformer related optimization, including BERT, GPT
☆17 Updated 2 years ago
wangzhaode / llm-export
llm-export can export llm model to onnx.
☆314 Updated last month
MAhaitao999 / Yolov3_Dynamic_Batch_TensorRT_Triton
Convert a Yolov3 model into a TensorRT model that supports dynamic-batch TensorRT inference and can be deployed on Triton Inference Serving
☆29 Updated 4 years ago
wang-xinyu / pytorchx
Implement popular deep learning networks in pytorch, used by tensorrtx.
☆197 Updated 3 years ago
col-in-coding / Tensorrt-CV
Using TensorRT for Inference Model Deployment.
☆49 Updated last year
tienluongngoc / yolov5_triton_inference_server
☆53 Updated 3 years ago
itsliupeng / torchnvjpeg
Decode JPEG image on GPU using PyTorch
☆92 Updated 2 years ago
TRT2022 / trtllm-llama
☢️ TensorRT 2023 second round: Llama model inference acceleration and optimization based on TensorRT-LLM
☆50 Updated 2 years ago
wangzhaode / onnx-llm
LLM deployment project based on ONNX.
☆45 Updated last year
drcut / NN_transform
Translate networks from different platforms to an Intermediate Representation (IR)
☆44 Updated 7 years ago
sunbelbd / mobius
Möbius Transformation for Fast Inner Product Search on Graph
☆22 Updated 4 years ago