mesolitica / vllm-whisper
A high-throughput and memory-efficient inference and serving engine for Whisper, https://mesolitica.com/blog/vllm-whisper
☆31Updated last year
Alternatives and similar repositories for vllm-whisper
Users who are interested in vllm-whisper are comparing it to the libraries listed below.
- Open TTS models, built for streaming on the edge☆44Updated 10 months ago
- ONNX and TensorRT implementation of Whisper☆66Updated 2 years ago
- A package for NeuCodec: a 50 Hz, 0.8 kbps, 24 kHz audio codec.☆138Updated 3 months ago
- OpenVINO version of openai/whisper☆180Updated 2 years ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆104Updated last year
- [EMNLP Main '25] LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation☆143Updated 7 months ago
- A curated list of awesome voice activity detection☆71Updated last year
- ☆158Updated last month
- Open-source reproducible benchmarks from Argmax☆77Updated this week
- Putting flows on top of neural transducers for better TTS☆64Updated last month
- Collection of Open Source Speech Data☆164Updated 3 months ago
- On-device voice activity detection (VAD) powered by deep learning☆241Updated this week
- Official implementation of the TTS model Lina-Speech☆175Updated last year
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023.☆179Updated last year
- Audio tokenization, in the fastest way possible!☆53Updated last year
- ☆385Updated last year
- Speaker Diarization with Transformers☆69Updated 7 months ago
- ☆259Updated last year
- ONNX Inference of Pyannote Segmentation☆97Updated last year
- Zero-shot Audio Classification using Whisper☆79Updated 3 years ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆92Updated 2 years ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆151Updated 2 years ago
- This repo is an exploratory experiment to enable frozen pretrained RWKV language models to accept speech modality input. We followed the …☆54Updated last year
- A TTS model that makes a speaker speak new languages☆76Updated last year
- An unofficial PyTorch implementation of VALL-E☆88Updated 5 months ago
- A toolkit for processing speech data and creating speech datasets☆195Updated 3 months ago
- Implementation of Google's USM speech model in PyTorch☆34Updated this week
- Tunable pipelines☆41Updated 4 months ago
- Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning (ASRU2023)☆27Updated 2 years ago
- ☆275Updated last year