CorentinJ / transcription-diff
A Python library to find differences between audio and transcriptions
☆14 · Updated 10 months ago
Related projects:
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆34 · Updated last week
- Make Kanye sing any song ya want ☆23 · Updated last year
- ☆48 · Updated last month
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆18 · Updated 3 months ago
- Streamlit app to visualize and edit TTS datasets ☆14 · Updated 2 years ago
- Create an LJSpeech structured voice dataset on wave input ☆16 · Updated 2 months ago
- ☆16 · Updated 11 months ago
- This project includes a Python script for fine-tuning a text-to-speech (TTS) model. The script utilizes custom datasets and uses CUDA for … ☆12 · Updated 4 months ago
- ☆61 · Updated last month
- Your one-stop solution for voice dataset creation ☆106 · Updated 9 months ago
- Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax. ☆34 · Updated last year
- AudioLDM text to audio colab ☆19 · Updated 10 months ago
- ☆34 · Updated 4 months ago
- Cog wrapper for collabora/WhisperSpeech ☆23 · Updated 6 months ago
- The best Gradio web UI for creating cover songs using mdx-net and rvc. Easy one-click installation. Fully portable. ☆9 · Updated last week
- Misc. tools/scripts that I made to use for tortoise ☆17 · Updated last month
- AudioBERT 📢: Audio Knowledge Augmented Language Model ☆14 · Updated this week
- (Windows/Linux) Local WebUI with neural network models (Text, Image, Video, 3D, Audio) in Python (Gradio interface). Translated on 3 lang… ☆37 · Updated this week
- VALL-E 2 reproduction ☆72 · Updated 2 months ago
- ☆74 · Updated 2 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆74 · Updated 2 months ago
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023. ☆149 · Updated 6 months ago
- Demo python script app to interact with llama.cpp server using whisper API, microphone and webcam devices. ☆44 · Updated 10 months ago
- Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support. ☆12 · Updated 7 months ago
- Putting flows on top of neural transducers for better TTS ☆63 · Updated last month
- ☆35 · Updated 3 weeks ago
- A TTS model that makes a speaker speak new languages ☆73 · Updated 3 months ago
- Site for sharing MusicGen + AudioGen Prompts and Creations ☆39 · Updated 2 months ago
- Multivoice: Enhance your foreign-language movie and TV show experience with personalized dubbed versions. Our project uses voice cloning … ☆22 · Updated last year
- ☆15 · Updated 7 months ago