triton-inference-server / developer_tools
☆18 Updated this week
Alternatives and similar repositories for developer_tools:
Users who are interested in developer_tools are comparing it to the libraries listed below.
- The Triton backend for TensorRT. ☆70 Updated last month
- The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's python API. ☆132 Updated last week
- The Triton backend for the ONNX Runtime. ☆140 Updated this week
- OpenVINO backend for Triton. ☆31 Updated this week
- ☆31 Updated 2 years ago
- FIL backend for the Triton Inference Server ☆77 Updated last week
- Wanwu models release, code will be released soon ☆24 Updated 2 years ago
- Model compression for ONNX ☆91 Updated 5 months ago
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs. ☆199 Updated 3 months ago
- Demonstration of the use of TensorRT and TRITON ☆16 Updated 4 years ago
- C++ implementations for various tokenizers (sentencepiece, tiktoken etc). ☆20 Updated last week
- ☆9 Updated 2 years ago
- Common source, scripts and utilities for creating Triton backends. ☆315 Updated this week
- TensorFlow and TVM integration ☆37 Updated 4 years ago
- AI-related samples made available by the DevTech ProViz team ☆29 Updated last year
- ONNX Command-Line Toolbox ☆35 Updated 6 months ago
- ☆60 Updated this week
- An easy way to run, test, benchmark and tune OpenCL kernel files ☆23 Updated last year
- This repository provides optical character detection and recognition solution optimized on Nvidia devices. ☆74 Updated last week
- A nvImageCodec library of GPU- and CPU- accelerated codecs featuring a unified interface ☆97 Updated 3 weeks ago
- ☆63 Updated 2 years ago
- Home for OctoML PyTorch Profiler ☆112 Updated last year
- The Triton backend for the PyTorch TorchScript models. ☆146 Updated this week
- Tutorial on how to convert machine learned models into ONNX ☆16 Updated 2 years ago
- Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large … ☆65 Updated 3 years ago
- RidgeRun Inference Framework ☆27 Updated 2 years ago
- torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i… ☆180 Updated 4 months ago
- ☆33 Updated last year
- Benchmark of TVM quantized model on CUDA ☆111 Updated 4 years ago
- ☆69 Updated 2 years ago