NVIDIA-AI-IOT / whisper_trt
A project that optimizes Whisper for low latency inference using NVIDIA TensorRT
☆95Updated last year
Alternatives and similar repositories for whisper_trt
Users who are interested in whisper_trt are comparing it to the libraries listed below.
- ONNX and TensorRT implementation of Whisper☆65Updated 2 years ago
- ONNX implementation of Whisper. PyTorch free.☆102Updated last year
- ASR/NLP/TTS deep learning inference library for NVIDIA Jetson using PyTorch and TensorRT☆218Updated last year
- Riva Python client API and CLI utils☆114Updated 3 weeks ago
- A Toolkit to Help Optimize Onnx Model☆267Updated last week
- NVIDIA Riva runnable tutorials☆158Updated 3 weeks ago
- [EMNLP Main '25] LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation☆138Updated 6 months ago
- ☆118Updated last week
- Experiments to test different speech recognition systems for SEPIA Framework☆62Updated 2 years ago
- [WACV 2026] LASER: Lip Landmark Assisted Speaker Detection for Robustness official implementation☆19Updated last week
- Zero-copy multimodal vector DB with CUDA and CLIP/SigLIP☆63Updated 7 months ago
- Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector…☆336Updated last year
- A very simple tool for situations where optimization with onnx-simplifier would exceed the Protocol Buffers upper file size limit of 2GB,…☆17Updated 2 months ago
- ☆106Updated last month
- A high-throughput and memory-efficient inference and serving engine for Whisper, https://mesolitica.com/blog/vllm-whisper☆31Updated last year
- Use Florence 2 to auto-label data for use in training fine-tuned object detection models.☆67Updated last year
- Sample C++ command-line Riva clients.☆36Updated 3 weeks ago
- This repository provides optical character detection and recognition solution optimized on Nvidia devices.☆83Updated 6 months ago
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023.☆178Updated last year
- Using FastChat-T5 Large Language Model, Vosk API for automatic speech recognition, and Piper for text-to-speech☆124Updated 2 years ago
- A toolkit for processing speech data and creating speech datasets☆189Updated 2 months ago
- A package for NeuCodec: a 50hz, 0.8kbps, 24kHz audio codec.☆127Updated 2 months ago
- A collection of all our phonemizers for dataset construction and inference☆27Updated 9 months ago
- ☆157Updated 3 weeks ago
- This is a repository that collects common audio noise reduction models, using Gradio to demonstrate the use of each model, which is very …☆48Updated last year
- libvits-ncnn is an ncnn implementation of the VITS library that enables cross-platform GPU-accelerated speech synthesis.🎙️💻☆62Updated 2 years ago
- Python scripts performing Open Vocabulary Object Detection using the YOLO-World model in ONNX.☆62Updated last year
- Voxtral: Convert Mistral into an end2end SpeechLM. No information bottleneck, preserves prosody, learns interruptions from data. Unlike GP…☆36Updated 9 months ago
- A reference application for a local AI assistant with LLM and RAG☆117Updated last year
- A simple, hackable text-to-speech system in PyTorch and MLX☆183Updated 4 months ago