fquirin / speech-recognition-experiments
Experiments to test different speech recognition systems for SEPIA Framework
☆60Updated last year
Alternatives and similar repositories for speech-recognition-experiments:
Users who are interested in speech-recognition-experiments are comparing it to the libraries listed below.
- Zero-shot Audio Classification using Whisper☆80Updated 2 years ago
- ONNX Inference of Pyannote Segmentation☆84Updated 3 months ago
- Apply machine learning model DTLN for noise suppression and acoustic echo cancellation on Raspberry Pi☆63Updated 3 years ago
- C++ version of pyannote audio speaker diarization pipeline☆20Updated last year
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆111Updated 2 years ago
- A sample Android app using [whisper.cpp](https://github.com/ggerganov/whisper.cpp/) to do voice-to-text transcriptions.☆64Updated last year
- On-device voice activity detection (VAD) powered by deep learning☆206Updated this week
- ONNX and TensorRT implementation of Whisper☆61Updated last year
- How to create your own model for vosk☆72Updated 3 years ago
- Python Wrapper of Silero VAD☆50Updated 3 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆95Updated 6 months ago
- Python bindings of speexdsp noise suppression library☆38Updated 2 years ago
- SpeechDenoiser: Real-Time Speech Denoising with ONNX Welcome to SpeechDenoiser, a simple and effective solution for real-time speech den…☆72Updated 8 months ago
- A lightweight pure C++ Text-to-Speech (TTS) pipeline with OpenVINO, supporting multiple languages.☆52Updated last week
- Tunable pipelines☆32Updated last month
- A simple but performant framework for mapping speech directly to categories and intents.☆19Updated 8 months ago
- libvits-ncnn is an ncnn implementation of the VITS library that enables cross-platform GPU-accelerated speech synthesis.🎙️💻☆60Updated last year
- Real-Time Whisper Voice Recognition with vosk model feedback.☆111Updated last year
- Kaldi-compatible online fbank extractor without external dependencies☆92Updated 2 weeks ago
- Streaming TTS based on Piper with optional RK3588 NPU support☆79Updated 3 months ago
- OpenVINO version of openai/whisper☆166Updated last year
- ONNX wrapper for espnet inference model☆162Updated 6 months ago
- An enterprise-grade Voice Activity Detector from modelscope and funasr.☆93Updated last year
- An unofficial implementation of the Personal VAD speaker-conditioned voice activity detection method. Bachelor's thesis project.☆65Updated 2 years ago
- ☆26Updated 2 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines☆94Updated 11 months ago
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023.☆166Updated last year
- Reproducible experimental protocols for multimedia (audio, video, text) database☆100Updated 2 months ago
- Port of Funasr's Paraformer model in C/C++☆31Updated 9 months ago
- Convert kaldi feature extraction and nnet3 models into Tensorflow Lite models. Currently aimed at converting kaldi's x-vector models and …☆20Updated 2 years ago