Manage audio and video datasets
☆33 · Feb 23, 2026 · Updated last week
Alternatives and similar repositories for audb
Users that are interested in audb are comparing it to the libraries listed below
- Format to store media files and annotations ☆12 · Feb 23, 2026 · Updated last week
- Handling audio files in Python ☆39 · Feb 12, 2026 · Updated 2 weeks ago
- NEAL (Nature+Energy Audio Labeller) is an open-source interactive audio data annotation tool. ☆18 · Apr 7, 2025 · Updated 10 months ago
- ☆13 · Oct 11, 2024 · Updated last year
- Render wav and convert it with [Diff-SVC](https://github.com/prophesier/diff-svc) model ☆10 · Aug 24, 2025 · Updated 6 months ago
- Web Presentation of the Lichtfeld Studio ☆19 · Feb 15, 2026 · Updated 2 weeks ago
- High-performance, semantic turn detection for conversational AI ☆34 · Oct 1, 2025 · Updated 5 months ago
- S3PRL for Speech Emotion Recognition (see s3prl > downstream) ☆15 · Updated this week
- Native app social interactions on mobile and tablet web ☆16 · Jan 28, 2014 · Updated 12 years ago
- German prenames as CSV data ☆11 · Mar 6, 2018 · Updated 7 years ago
- GPT for FACodec ☆13 · Mar 25, 2024 · Updated last year
- Machine learning speaker characteristics ☆42 · Updated this week
- A Python package for estimating the impact of features on ML models ☆14 · May 18, 2023 · Updated 2 years ago
- An AI character interaction system with emotional modeling and advanced memory management ☆17 · Oct 26, 2024 · Updated last year
- LoRA-based phoneme/prosody control for LLM-based TTS with no G2P - Lightweight adapter for edit and control the target language's phoneme… ☆23 · Aug 14, 2025 · Updated 6 months ago
- ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models (TTS) ☆10 · Mar 9, 2024 · Updated last year
- Repository for speech paper reading ☆33 · Aug 19, 2021 · Updated 4 years ago
- ☆16 · Dec 18, 2023 · Updated 2 years ago
- Benchmarking LLMs as Casual Card Game AIs ☆20 · Jan 22, 2025 · Updated last year
- Voice conversion with just linear regression. ☆35 · Sep 25, 2025 · Updated 5 months ago
- ☆19 · Mar 22, 2024 · Updated last year
- ☆20 · Sep 2, 2024 · Updated last year
- melodic object transcription framework ☆26 · Nov 15, 2017 · Updated 8 years ago
- Inference code for Interspeech 2025 paper, "LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec" ☆35 · Oct 23, 2025 · Updated 4 months ago
- Evaluation Protocol for Large-Scale Zero-Shot TTS Literature ☆93 · Mar 12, 2025 · Updated 11 months ago
- This repository presents an evaluation framework for speech-to-speech (S2S) models, following the methodology described in the EmphAsses … ☆24 · Jan 9, 2024 · Updated 2 years ago
- Pushing the Limits of Zero-shot End-to-End Speech Translation ☆26 · Dec 12, 2024 · Updated last year
- Collaborative audio annotation tool ☆18 · Sep 16, 2022 · Updated 3 years ago
- Temporary anonymous version ☆22 · Mar 20, 2024 · Updated last year
- My hybrid TTS network that combines, VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one ☆26 · Aug 5, 2024 · Updated last year
- Supervoice diffusion enhance ☆28 · Jul 15, 2024 · Updated last year
- ☆60 · Jan 8, 2025 · Updated last year
- The Munich Open-Source Large-Scale Multimedia Feature Extractor ☆783 · Jan 26, 2026 · Updated last month
- A TTS Trained on Universal Audio. ☆41 · Jun 6, 2025 · Updated 8 months ago
- Non Parallel Voice Conversion based on VITS ☆24 · Mar 31, 2023 · Updated 2 years ago
- PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis ☆55 · Oct 15, 2021 · Updated 4 years ago
- ☆37 · Sep 21, 2025 · Updated 5 months ago
- ☆24 · Mar 30, 2024 · Updated last year
- ☆25 · Aug 31, 2024 · Updated last year