matallanas / whisper_gpt_pipeline
A Hugging Face pipeline to train a GPT model on transcripts obtained from the OpenAI Whisper model
☆15 Updated 2 years ago
Alternatives and similar repositories for whisper_gpt_pipeline:
Users who are interested in whisper_gpt_pipeline are comparing it to the repositories listed below.
- ☆62 Updated 8 months ago
- ☆156 Updated last year
- ☆147 Updated last year
- Zero-shot Audio Classification using Whisper ☆80 Updated 2 years ago
- 🔊 Text-prompted Generative Audio Model - With the ability to clone voices ☆20 Updated last year
- Repository contains code to fine-tune WhisperASR model ☆23 Updated 2 years ago
- ☆39 Updated 11 months ago
- create dataset from list of youtube links easily ☆17 Updated 2 years ago
- Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification. ☆14 Updated last month
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ☆135 Updated last year
- Code for OpenAI Whisper Web App Demo ☆93 Updated 2 years ago
- Text to speech is an emerging zone of AI. This repository helps to create a dataset with audio and transcripts for personalized text to s… ☆28 Updated 2 years ago
- Command-line script for inferencing from models such as MPT-7B-Chat ☆101 Updated last year
- Examples of apps built with Nendo, the AI Audio Tool Suite ☆55 Updated last year
- BIG: Back In the Game of Creative AI ☆27 Updated 2 years ago
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 Updated 10 months ago
- ☆30 Updated last year
- The demo page of UniAudio ☆33 Updated last year
- Towards Robust Blind Face Restoration with Codebook Lookup Transformer ☆28 Updated last year
- a simple system for 2-way interruptible voice interactions between human and LLM ☆25 Updated last year
- A simple voice conversion tool ☆17 Updated 3 years ago
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io ☆16 Updated 11 months ago
- ☆28 Updated last year
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆95 Updated 6 months ago
- A high-throughput and memory-efficient inference and serving engine for Whisper, https://mesolitica.com/blog/vllm-whisper ☆25 Updated 8 months ago
- Cog wrapper for collabora/WhisperSpeech ☆25 Updated last year
- A high-quality, varied ~30hr voice dataset suitable for training a TTS model ☆59 Updated 2 years ago
- ☆32 Updated last year
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆62 Updated last week
- ☆107 Updated last year