CorentinJ / transcription-diff
A Python library to find differences between audio and transcriptions
☆15 · Updated last year
Related projects
Alternatives and complementary repositories for transcription-diff
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆45 · Updated 2 weeks ago
- ☆61 · Updated 3 months ago
- AudioLDM text to audio colab ☆19 · Updated last year
- ☆54 · Updated this week
- The YouTube Text-To-Speech dataset comprises waveform audio extracted from YouTube videos alongside their English transcriptions. ☆50 · Updated 3 years ago
- VI-SVC model is just VITS without MAS and DurationPredictor. ☆10 · Updated last year
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… ☆43 · Updated last month
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 · Updated 5 months ago
- ☆26 · Updated 11 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆11 · Updated last month
- ☆29 · Updated 11 months ago
- Create an LJSpeech structured voice dataset on wave input ☆21 · Updated last month
- Text-to-speech with multilingual support (20+ languages) ☆35 · Updated last year
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆11 · Updated 5 months ago
- Gradio Client in Rust. ☆23 · Updated last month
- Site for sharing MusicGen + AudioGen Prompts and Creations ☆39 · Updated 4 months ago
- 🎨 Imagine what Picasso could have done with AI. Self-host your StableDiffusion API. ☆49 · Updated last year
- VALL-E 2 reproduction ☆87 · Updated 4 months ago
- Use VITS and Opencpop to develop singing voice synthesis; Different from VISinger. ☆33 · Updated last year
- Auto-Video maker handling many AIs ☆12 · Updated 8 months ago
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆12 · Updated this week
- Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax. ☆34 · Updated last year
- ☆34 · Updated 6 months ago
- Use quantized versions of Whisper to speed up inference ☆11 · Updated last month
- Heteronym to Phoneme Parser ☆15 · Updated last year
- This project includes a Python script for fine-tuning a text-to-speech (TTS) model. The script utilizes custom datasets and uses CUDA for … ☆13 · Updated last month
- Demo python script app to interact with llama.cpp server using whisper API, microphone and webcam devices. ☆45 · Updated last year
- ☆18 · Updated last year
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆83 · Updated last month
- Make Kanye sing any song ya want 🎤🔥 ☆23 · Updated last year