matallanas / whisper_gpt_pipeline
A Hugging Face pipeline to train a GPT model on transcripts produced by the OpenAI Whisper model
☆17Updated 2 years ago
Alternatives and similar repositories for whisper_gpt_pipeline
Users who are interested in whisper_gpt_pipeline are comparing it to the libraries listed below
- ☆62Updated last year
- ☆158Updated 2 years ago
- Speaker Diarization with Transformers☆69Updated 5 months ago
- The demo page of UniAudio☆34Updated last year
- Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax.☆38Updated 2 years ago
- openvino version of openai/whisper☆177Updated 2 years ago
- Sing an idea ➡️ AI music sample🔥🎶☆118Updated last year
- Open TTS models, built for streaming on the edge☆44Updated 8 months ago
- Zero-shot Audio Classification using Whisper☆78Updated 2 years ago
- Command-line script for inferencing from models such as MPT-7B-Chat☆100Updated 2 years ago
- BSRGAN-Pip: Packaged version of the BSRGAN repository☆14Updated 2 years ago
- A high-throughput and memory-efficient inference and serving engine for Whisper, https://mesolitica.com/blog/vllm-whisper☆31Updated last year
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.☆137Updated 2 years ago
- 🔊 Text-prompted Generative Audio Model - With the ability to clone voices☆19Updated 2 years ago
- ☆86Updated 2 years ago
- 🐤 Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation☆259Updated last week
- [WIP] A 🔥 interface for running code in the cloud☆85Updated 2 years ago
- Trying to build an all in one speech-text language model - a bit like GPT-4o☆22Updated last year
- Text to speech is an emerging zone of AI. This repository helps to create a dataset with audio and transcripts for personalized text to s…☆28Updated 2 years ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines☆97Updated last year
- ☆261Updated last year
- Google Colab-backed Web UI for creating music with OpenAI Jukebox☆84Updated 2 years ago
- Make-A-Video Latent Diffusion Model☆19Updated 2 years ago
- Examples of apps built with Nendo, the AI Audio Tool Suite☆55Updated last year
- ☆359Updated last year
- Improving transcription performance of OpenAI Whisper for CPU based deployment☆256Updated 3 years ago
- 🐸Coqui Dialogue Audio Pack contains more than 2000 audio files of synthetic human voices over dialogue created specifically for video ga…☆42Updated 2 years ago
- Towards Robust Blind Face Restoration with Codebook Lookup Transformer☆33Updated last year
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆102Updated last year
- VoiceBox neural network implementation☆110Updated last year