clement-pages / gryannote
Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.
☆63 Updated last month
Alternatives and similar repositories for gryannote
Users who are interested in gryannote are comparing it to the libraries listed below.
- Open TTS models, built for streaming on the edge ☆43 Updated 4 months ago
- Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ☆90 Updated last month
- Collection of Open Source Speech Data ☆159 Updated 8 months ago
- Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification. ☆20 Updated 4 months ago
- VoiceBox neural network implementation ☆108 Updated 11 months ago
- Audio tokenization, in the fastest way possible! ☆52 Updated 10 months ago
- Speaker Diarization with Transformers ☆68 Updated last month
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… ☆74 Updated 9 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆100 Updated 9 months ago
- Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆15 Updated last year
- ☆62 Updated 11 months ago
- ☆52 Updated last week
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine tune or train a voice model. ☆41 Updated 2 weeks ago
- Joint speech-language model - respond directly to audio! ☆30 Updated last year
- Official code for "F5R-TTS: Improving Flow-Matching based Text-to-Speech with Group Relative Policy Optimization" ☆100 Updated last month
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆17 Updated 7 months ago
- ☆85 Updated last year
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT" ☆38 Updated this week
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆88 Updated last year
- Official implementation for FlowSep ☆54 Updated 6 months ago
- Google's SoundStorm: Efficient Parallel Audio Generation ☆132 Updated last year
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆95 Updated 6 months ago
- Add n-gram and large language model (LLM) support to Whisper models. ☆29 Updated 2 months ago
- A TTS model that makes a speaker speak new languages ☆76 Updated last year
- LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation ☆115 Updated last month
- An implementation for training the HiFi-GAN part of the XTTSv2 model using Coqui/TTS. ☆82 Updated 8 months ago
- SoTA open-source TTS ☆63 Updated last month
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆109 Updated last month
- ☆75 Updated last month
- The official implementation of "A Language Modeling Approach to Diacritic-Free Hebrew TTS" ☆100 Updated last month