apple / dmel-demo
dMel: Speech Tokenization Made Simple
☆13 · Updated 3 weeks ago
Alternatives and similar repositories for dmel-demo
Users who are interested in dmel-demo are comparing it to the libraries listed below:
- Declare your datasets and download them using a simple tool ☆10 · Updated 10 months ago
- ☆20 · Updated 2 months ago
- Training hybrid models for dummies. ☆21 · Updated 4 months ago
- ☆13 · Updated 6 months ago
- Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in MLX ☆20 · Updated 7 months ago
- A small Rust-based data loader ☆24 · Updated 5 months ago
- Implementation of https://arxiv.org/pdf/2312.09299 ☆20 · Updated 11 months ago
- ☆37 · Updated last month
- GoldFinch and other hybrid transformer components ☆10 · Updated 3 weeks ago
- Rust crate for some audio utilities ☆23 · Updated 2 months ago
- Audio Entailment: Deductive Reasoning for Audio Understanding ☆12 · Updated 5 months ago
- Acoustic Neighbor Embeddings ☆23 · Updated 5 months ago
- ☆10 · Updated last year
- ☆40 · Updated 3 months ago
- Implementation of 'Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis', in MLX ☆19 · Updated 7 months ago
- [Early Alpha] A unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, voice activit… ☆21 · Updated 4 months ago
- Official code for "F5R-TTS: Improving Flow-Matching based Text-to-Speech with Group Relative Policy Optimization" ☆38 · Updated last week
- Audiogen Codec ☆137 · Updated 10 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆98 · Updated 7 months ago
- Proof of concept for running moshi/hibiki using webrtc ☆19 · Updated 3 months ago
- GPT-style network for phonemization with durations of text ☆66 · Updated last year
- Framework for writing deep learning training loops. Lightweight, and retaining full freedom to design as you see fit. It handles checkpo… ☆112 · Updated last year
- Python bindings for symphonia/opus - read various audio formats from Python and write opus files ☆64 · Updated last month
- ☆13 · Updated 9 months ago
- Basic Denoising Diffusion Probabilistic Model image generator implemented in PyTorch ☆10 · Updated 5 months ago
- StyleTTS 2 Optimized Training Fork ☆29 · Updated 4 months ago
- A project for tri-modal LLM benchmarking and instruction tuning. ☆35 · Updated 2 months ago
- An open-source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO ☆29 · Updated last week
- LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning ☆137 · Updated 11 months ago
- IPA Phonemizer/Dephonemizer for 139 human languages ☆27 · Updated last month