ProjectEGU / whisper-for-low-vram
Robust Speech Recognition via Large-Scale Weak Supervision
☆30 Updated last year
Alternatives and similar repositories for whisper-for-low-vram:
Users that are interested in whisper-for-low-vram are comparing it to the libraries listed below
- Accelerate Whisper tasks such as transcription, by multiprocessing through parallelization ☆25 Updated 2 years ago
- A testing repo to share code and thoughts on diarisation ☆53 Updated 11 months ago
- Misc. tools/scripts that I made to use for tortoise ☆21 Updated 7 months ago
- StyleTTS 2 Optimized Training Fork ☆26 Updated last month
- Your one-stop solution for voice dataset creation ☆117 Updated last year
- Split long audio files based on subtitle-info in SRT File (Transcript saved in CSV) ☆20 Updated 5 years ago
- Coqui AI TTS plugin ☆74 Updated last week
- Multivoice: Enhance your foreign-language movie and TV show experience with personalized dubbed versions. Our project uses voice cloning … ☆26 Updated last year
- Whisper combined with Silero VAD, for improved long-form transcriptions ☆47 Updated 2 years ago
- Synchronize Whisper's timestamps over an existing accurate transcription ☆142 Updated 9 months ago
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆15 Updated last year
- ☆95 Updated 10 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆60 Updated last week
- liujing04/Retrieval-based-Voice-Conversion-WebUI reconstruction project ☆33 Updated last year
- Text to speech is an emerging zone of AI. This repository helps to create a dataset with audio and transcripts for personalized text to s… ☆27 Updated 2 years ago
- GradioUI for TortoiseTTS voice generation ☆34 Updated last year
- Create training data for training a voice cloner for bark text to speech. ☆43 Updated last year
- List of repositories relevant to VITS. ☆36 Updated 2 years ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆94 Updated 10 months ago
- a Frontier Japanese Speech Generation net ☆27 Updated last week
- Render wav and convert it with [Diff-SVC](https://github.com/prophesier/diff-svc) model ☆10 Updated 2 years ago
- ☆82 Updated 8 months ago
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine tune or train a voice model. ☆32 Updated 2 weeks ago
- Retrieval-based Voice Conversion (RVC) implemented with Hugging Face Transformers. ☆67 Updated last year
- Heteronym to Phoneme Parser ☆18 Updated last year
- create dataset from list of youtube links easily ☆17 Updated last year
- Finetuning VITS Efficiently ☆32 Updated last year
- Community framework for training tortoise ☆41 Updated 2 years ago