matallanas / whisper_gpt_pipeline
A Hugging Face pipeline to train a GPT model based on the transcripts obtained by the OpenAI Whisper model
☆15 Updated last year
Related projects
Alternatives and complementary repositories for whisper_gpt_pipeline
- ☆152 Updated last year
- ☆34 Updated 6 months ago
- ☆61 Updated 3 months ago
- Towards Building Text-To-Speech Systems for the Next Billion Users - Microsoft Research Intern Work - Accepted at ICASSP 2023 ☆47 Updated last year
- Examples of apps built with Nendo, the AI Audio Tool Suite ☆56 Updated 8 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆42 Updated this week
- [WIP] AI Try-On plugin for Chrome ☆24 Updated 7 months ago
- VoiceBox neural network implementation ☆96 Updated 3 months ago
- BIG: Back In the Game of Creative AI ☆25 Updated last year
- VALL-E 2 reproduction ☆83 Updated 3 months ago
- ☆251 Updated 7 months ago
- Gradio Client in Rust. ☆23 Updated 3 weeks ago
- ☆145 Updated last year
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ☆133 Updated last year
- 🍳 AyaMCooking is a Voice-to-Voice Multi-lingual RAG Agent that makes a perfect sous chef for your kitchen, in up to 10 Languages 🤌🧑‍🍳 ☆16 Updated 2 weeks ago
- Speaker Diarization with Transformers ☆59 Updated 5 months ago
- ☆35 Updated last year
- ☆15 Updated last year
- Zero-shot Audio Classification using Whisper ☆74 Updated last year
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor… ☆53 Updated 6 months ago
- Sing an idea ➡️ AI music sample🔥🎶 ☆90 Updated 6 months ago
- Transcription with speaker diarization pipeline ☆85 Updated last year
- ☆30 Updated 10 months ago
- Text to speech is an emerging zone of AI. This repository helps to create a dataset with audio and transcripts for personalized text to s… ☆27 Updated last year
- Repository contains code to fine-tune WhisperASR model ☆23 Updated last year
- Towards Robust Blind Face Restoration with Codebook Lookup Transformer ☆27 Updated 9 months ago
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io ☆16 Updated 6 months ago
- ☆51 Updated last week
- Basic framework for training Dreambooth Stable Diffusion v1.5 on Banana's v1.0 serverless GPU platform ☆36 Updated last year
- A collection of notebooks for the Hugging Face blog series (https://huggingface.co/blog). ☆42 Updated 3 months ago