gaspardpetit / verbatim
A composition of offline tools to achieve high-quality multilingual speech-to-text transcription
☆22 Updated 2 months ago
Alternatives and similar repositories for verbatim
Users who are interested in verbatim are comparing it to the libraries listed below.
- a Neural Vocoder supporting Ring Attention, Conformer and NSF. ☆21 Updated 3 months ago
- ☆14 Updated last year
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 Updated 2 years ago
- Collection of scripts from mHuBERT-147. ☆31 Updated 11 months ago
- Dippy Synthetic Speech Subnet ☆17 Updated last month
- A collection of all our phonemizers for dataset construction and inference ☆27 Updated 8 months ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆15 Updated 5 months ago
- ☆16 Updated 6 months ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks ☆17 Updated 2 years ago
- ☆14 Updated 2 years ago
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆19 Updated 11 months ago
- ☆24 Updated 5 months ago
- speaker-disentangled speech linguistic content quantizer ☆22 Updated 7 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆15 Updated last year
- The Vokan Architecture (Tsukasa speech based) ☆10 Updated 8 months ago
- C++ version of pyannote audio overlapped speech detection pipeline ☆13 Updated last year
- ☆14 Updated 2 years ago
- This repository contains all the code necessary for running the multilingual distilwhisper from Ferraz et al. 2024 IEEE ICASSP paper. ☆29 Updated last week
- StyleTTS 2 Optimized Training Fork ☆34 Updated 9 months ago
- KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a… ☆12 Updated 2 years ago
- Open TTS models, built for streaming on the edge ☆43 Updated 7 months ago
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech ☆10 Updated 5 months ago
- Text-to-Speech Latency Benchmark ☆18 Updated 4 months ago
- Audio tokenization, in the fastest way possible! ☆53 Updated last year
- Forced alignment decoder for Whisper. ☆14 Updated last year
- ☆44 Updated 3 months ago
- ☆25 Updated last week
- ☆40 Updated 3 months ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆29 Updated 2 years ago
- ☆27 Updated last year