diegofiori / benchmark-pytorch2.0-with-nebullvm
☆9 · Updated last year
Related projects
Alternatives and complementary repositories for benchmark-pytorch2.0-with-nebullvm
- Model compression for ONNX ☆73 · Updated 3 weeks ago
- A project that optimizes Whisper for low latency inference using NVIDIA TensorRT ☆59 · Updated 3 weeks ago
- A tool to convert TensorRT engine/plan to a fake ONNX ☆37 · Updated last year
- New operators for the ReferenceEvaluator, new kernels for onnxruntime, CPU, CUDA ☆28 · Updated last month
- The Triton backend for the ONNX Runtime. ☆129 · Updated this week
- ONNX and TensorRT implementation of Whisper ☆58 · Updated last year
- The Triton backend for TensorRT. ☆62 · Updated this week
- Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch), including a converter from PyTorch -> O… ☆31 · Updated 3 years ago
- This library empowers users to seamlessly port pretrained models and checkpoints on the HuggingFace (HF) hub (developed using HF transfor… ☆47 · Updated this week
- A Toolkit to Help Optimize ONNX Models ☆75 · Updated this week
- Article about deploying machine learning models using gRPC, PyTorch, and asyncio ☆24 · Updated last year
- Zero-copy multimodal vector DB with CUDA and CLIP/SigLIP ☆36 · Updated 5 months ago
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data ☆13 · Updated 3 months ago
- A very simple tool for situations where optimization with onnx-simplifier would exceed the Protocol Buffers upper file size limit of 2GB,… ☆14 · Updated 6 months ago
- An open source implementation of CLIP. ☆32 · Updated 2 years ago
- Collection of models and extensions for deployment in PyTorch ☆24 · Updated last year
- Torchserve + TensorRT + Detection ☆18 · Updated 2 years ago
- Count number of parameters / MACs / FLOPS for ONNX models. ☆88 · Updated 2 weeks ago
- The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's Python API. ☆123 · Updated this week
- ONNX Command-Line Toolbox ☆35 · Updated 3 weeks ago
- Wanwu models release; code will be released soon ☆24 · Updated 2 years ago
- This repository provides an optical character detection and recognition solution optimized on NVIDIA devices. ☆53 · Updated 3 weeks ago
- ☆51 · Updated last week
- The Triton backend for the PyTorch TorchScript models. ☆123 · Updated this week
- Materials for demonstrating video model deployment ☆17 · Updated 4 years ago
- ☆30 · Updated 2 years ago
- Using the open-source LLM Llama2 by Meta for local CPU inference on document question-and-answer ☆15 · Updated last year
- ☆189 · Updated this week
- Scailable ONNX Python tools ☆96 · Updated 2 weeks ago
- ☆156 · Updated last year