elevenlabs / opuspy
Opus codec support for Python.
☆27 Updated 2 years ago
Alternatives and similar repositories for opuspy:
Users that are interested in opuspy are comparing it to the libraries listed below
- Unofficial implementation of wavenext vocoder ☆44 Updated 7 months ago
- Heteronym to Phoneme Parser ☆18 Updated last year
- 🚫 check your data, before you wreck your model ☆16 Updated 2 years ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆95 Updated 5 months ago
- (WIP) A retrain of F5-TTS on permissively-licensed data ☆9 Updated 2 weeks ago
- A high-resolution implementation of HiFi-GAN Vocoder for Voice Conversion. ☆31 Updated last year
- Convert English text from written expressions into spoken forms ☆24 Updated 2 years ago
- StyleTTS 2 Optimized Training Fork ☆26 Updated last month
- ☆59 Updated last year
- Simple PyTorch Denoisers for Waveform Audio ☆35 Updated last month
- Unofficial implementation of ConvNeXt-TTS powered by lightning ☆15 Updated 5 months ago
- AudioSR-Upsampling (any -> 48kHz) ☆40 Updated last year
- Multispeaker Community Vocoder Model for DiffSinger ☆35 Updated 10 months ago
- Test code disclosure for the research paper "UnDiff: Unsupervised Voice Restoration with Unconditional Diffusion Model", as a supplementa… ☆20 Updated last year
- Supervoice diffusion enhance ☆26 Updated 8 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆61 Updated 2 weeks ago
- Google's SoundStorm: Efficient Parallel Audio Generation ☆131 Updated last year
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 Updated 10 months ago
- A curated list of awesome voice activity detection ☆45 Updated 4 months ago
- Monotonic Alignment Search ☆90 Updated 2 years ago
- VoiceBox neural network implementation ☆105 Updated 8 months ago
- Acoustic models for: A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion ☆102 Updated last year
- The TTSDS benchmark evaluates synthetic speech quality by considering prosody, speaker identity, and intelligibility, comparing these fac… ☆30 Updated 4 months ago
- NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates [WIP] ☆24 Updated 2 years ago
- Implementation of Emo-StarGAN ☆45 Updated last year
- Differentiable Mean Opinion Score Regularization for Perceptual Speech Enhancement ☆22 Updated last year
- A fast python library for aligning similar audio snippets passed in as NumPy arrays ☆44 Updated 2 weeks ago
- Finetuning VITS Efficiently ☆32 Updated last year
- Codebase and project page for EDMSound ☆34 Updated last year
- A TTS model that makes a speaker speak new languages ☆76 Updated 9 months ago