i4Ds / whisper-finetune
This repository contains code for fine-tuning the Whisper speech-to-text model.
☆8 Updated 2 months ago
Alternatives and similar repositories for whisper-finetune:
Users who are interested in whisper-finetune are comparing it to the libraries listed below:
- Zero-Shot Emotion Style Transfer ☆45 Updated this week
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… ☆69 Updated 6 months ago
- ☆24 Updated last year
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆17 Updated 5 months ago
- ☆38 Updated 7 months ago
- ☆50 Updated 3 weeks ago
- ☆27 Updated 3 weeks ago
- DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability ☆101 Updated 3 months ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆86 Updated 4 months ago
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆16 Updated last year
- An open-source Kazakh Emotional Text-to-Speech Dataset ☆28 Updated last year
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆81 Updated last year
- a Frontier Japanese Speech Generation net ☆31 Updated last month
- The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions" ☆38 Updated 2 weeks ago
- ☆59 Updated last week
- TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching ☆48 Updated this week
- Simple and lightweight Zero-shot Text-to-Speech (TTS) synthesis model ☆20 Updated 2 weeks ago
- VoiceBox neural network implementation ☆106 Updated 8 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆95 Updated 6 months ago
- Official Code for ParrotTTS ☆48 Updated 6 months ago
- SelfRemaster: SSL Speech Restoration ☆88 Updated last year
- The official implementation of EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector (TAFFC 20… ☆85 Updated last week
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency ☆52 Updated 6 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆65 Updated 5 months ago
- Implementation of Emo-StarGAN ☆45 Updated last year
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆87 Updated last year
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆62 Updated 2 weeks ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆63 Updated last year
- ☆26 Updated 2 months ago
- Collection of scripts from mHuBERT-147. ☆24 Updated 5 months ago