triton-inference-server / paddlepaddle_backend
☆32Updated 9 months ago
Related projects
Alternatives and complementary repositories for paddlepaddle_backend
- Compare multiple optimization methods on Triton to improve model service performance☆46Updated 10 months ago
- Triton Inference Server Model Config and Client Scripts☆31Updated 2 years ago
- OneFlow->ONNX☆42Updated last year
- PaddlePaddle Developer Community☆88Updated this week
- The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's python API.☆125Updated 2 weeks ago
- Simple Dynamic Batching Inference☆145Updated 2 years ago
- Transformer related optimization, including BERT, GPT☆17Updated last year
- ☆242Updated last week
- Serving Inside Pytorch☆145Updated this week
- ☢️ TensorRT 2023 second round: Llama model inference acceleration and optimization based on TensorRT-LLM☆44Updated last year
- TensorRT Plugin Autogen Tool☆367Updated last year
- Models and examples built with OneFlow☆96Updated last month
- ☆123Updated 2 weeks ago
- PaddlePaddle (飞桨) custom device implementation for integrating custom hardware.☆70Updated this week
- ☆96Updated 3 years ago
- Common source, scripts and utilities shared across all Triton repositories.☆62Updated this week
- ☆76Updated last week
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs.☆185Updated 2 months ago
- ☆70Updated last year
- ☆140Updated 6 months ago
- Triton server ensemble model demo☆30Updated 2 years ago
- Wanwu models release, code will be released soon☆24Updated 2 years ago
- Tianchi NVIDIA TensorRT Hackathon 2023, Generative AI Model Optimization Contest: third-place solution in the preliminary round☆47Updated last year
- Transformer related optimization, including BERT, GPT☆39Updated last year
- A Toolkit to Help Optimize ONNX Models☆80Updated this week
- Run ChatGLM2-6B on BM1684X☆48Updated 8 months ago
- Transformer related optimization, including BERT, GPT☆60Updated last year
- Datasets, Transforms and Models specific to Computer Vision☆82Updated last year
- Convert a Yolov3 model into a TensorRT model that supports dynamic-batch TensorRT inference and deployment on Triton Inference Serving☆27Updated 3 years ago
- DeepLearning Framework Performance Profiling Toolkit☆277Updated 2 years ago