Kowalski1024 / Mi-Go
Mi-Go is an open-source test framework designed to evaluate and compare the accuracy of speech-to-text models on a YouTube dataset.
☆12Updated last year
Alternatives and similar repositories for Mi-Go
Users that are interested in Mi-Go are comparing it to the libraries listed below
- Target speaker automatic speech recognition (TS-ASR)☆11Updated last year
- Code for "Error-driven Fixed-Budget ASR Personalization for Accented Speakers" in ICASSP 2021☆11Updated 4 years ago
- Breton speech recognizer with Vosk☆14Updated 2 months ago
- Sequence to sequence model for Arabic punctuation prediction.☆12Updated 5 years ago
- [INTERSPEECH 2024] Official code for VoxSim: A perceptual voice similarity dataset☆11Updated 2 weeks ago
- Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inventories☆18Updated 6 months ago
- eCMU: An Efficient Phase-aware Framework for Music Source Separation with Conformer (IEEE RIVF23)☆10Updated 11 months ago
- Code for the winning solution in the SE&R 2022 Challenge - SER track.☆16Updated 2 years ago
- ☆17Updated last year
- Project of Singing Voice Conversion.☆15Updated last year
- DysfluentWFST☆14Updated 2 weeks ago
- 🎵 muse: Music Separation☆10Updated last year
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval☆13Updated 3 months ago
- Python package of MP-SENet from Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement.☆14Updated 11 months ago
- Whisper finetuning☆14Updated 6 months ago
- Whisper Speech Quality Assessment (WhiSQA)☆15Updated 10 months ago
- ☆16Updated 5 months ago
- Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement"☆13Updated 3 years ago
- Evaluation of STT models for the German language☆15Updated 3 years ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.☆15Updated 4 months ago
- ☆15Updated 3 months ago
- PyTorch Implementation of [WMCodec: End-to-End Neural Speech Codec with Deep Watermarking for Authenticity Verification](https://arxiv.or…☆16Updated 2 months ago
- Test Framework for few-shot open set KWS☆36Updated 11 months ago
- Forced alignment decoder for Whisper.☆14Updated last year
- Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.☆17Updated 2 months ago
- ☆13Updated this week
- ☆13Updated 2 years ago
- 🫠 check your data, before you wreck your model☆16Updated 3 years ago
- Getting confidences from any end-to-end systems☆11Updated 2 years ago
- Audio-JEPA is an adaptation of the Joint-Embedding Predictive Architecture (JEPA) for self-supervised audio representation learning. Buil…☆29Updated 3 months ago