Manage audio and video datasets
☆35Mar 4, 2026Updated 2 weeks ago
Alternatives and similar repositories for audb
Users who are interested in audb are comparing it to the libraries listed below.
- Format to store media files and annotations☆12Mar 13, 2026Updated last week
- Handling audio files in Python☆39Feb 12, 2026Updated last month
- Machine learning speaker characteristics☆43Updated this week
- German prenames as CSV data☆11Mar 6, 2018Updated 8 years ago
- Gamera 4 for Python 3☆14May 16, 2025Updated 10 months ago
- Label and annotate a large number of speech data files☆12May 5, 2021Updated 4 years ago
- S3PRL for Speech Emotion Recognition (see s3prl > downstream)☆15Feb 28, 2026Updated 3 weeks ago
- GPT for FACodec☆13Mar 25, 2024Updated last year
- ☆13Oct 11, 2024Updated last year
- Supervoice diffusion enhance☆28Jul 15, 2024Updated last year
- ☆13Feb 8, 2017Updated 9 years ago
- Emofilt is a program to simulate emotional arousal with speech synthesis based on the free-for-non-commercial-use MBROLA synthesis engine…☆14Mar 17, 2022Updated 4 years ago
- openXBOW - the Passau Open-Source Crossmodal Bag-of-Words Toolkit☆84Feb 17, 2021Updated 5 years ago
- NEAL (Nature+Energy Audio Labeller) is an open-source interactive audio data annotation tool.☆18Apr 7, 2025Updated 11 months ago
- Welcome to the Real-Time Voice Activity Detection (VAD) program, powered by Silero-VAD model! 🚀 This program allows you to perform live …☆12Jul 9, 2023Updated 2 years ago
- ☆15Nov 3, 2020Updated 5 years ago
- ☆19Mar 22, 2024Updated 2 years ago
- Score-aligned loudness, beat, and expressive markings data for 2000 Chopin Mazurka recordings☆14Jul 6, 2023Updated 2 years ago
- High-performance, semantic turn detection for conversational AI☆35Oct 1, 2025Updated 5 months ago
- A Python package for estimating the impact of features on ML models☆14May 18, 2023Updated 2 years ago
- ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models (TTS)☆10Mar 9, 2024Updated 2 years ago
- This repository is developed in MATLAB. Speech Augmentation is based on Adaptive Filtering while Endpoint Detection is based on Voice Act…☆10Dec 7, 2020Updated 5 years ago
- melodic object transcription framework☆26Nov 15, 2017Updated 8 years ago
- The Munich Open-Source Large-Scale Multimedia Feature Extractor☆787Jan 26, 2026Updated last month
- Pushing the Limits of Zero-shot End-to-End Speech Translation☆26Dec 12, 2024Updated last year
- ☆16Dec 18, 2023Updated 2 years ago
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆26Aug 5, 2024Updated last year
- ☆36Feb 23, 2017Updated 9 years ago
- Tool to analyze an audio corpus in terms of intonation, intensity, duration and voice quality☆23Jun 17, 2019Updated 6 years ago
- The SEILS Dataset☆16Oct 24, 2021Updated 4 years ago
- DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code☆10Mar 8, 2022Updated 4 years ago
- An unofficial implementation of "UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding".☆26Nov 4, 2023Updated 2 years ago
- Evaluation Protocol for Large-Scale Zero-Shot TTS Literature☆93Mar 12, 2025Updated last year
- Render wav and convert it with [Diff-SVC](https://github.com/prophesier/diff-svc) model☆10Aug 24, 2025Updated 6 months ago
- Inference code for Interspeech 2025 paper, "LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec"☆35Oct 23, 2025Updated 5 months ago
- Non Parallel Voice Conversion based on VITS☆24Mar 31, 2023Updated 2 years ago
- This repository implements a novel zero-shot TTS framework, named Flamed-TTS, focusing on the efficient generation and dynamic pacing in …☆57Aug 9, 2025Updated 7 months ago
- SERAB: a multi-lingual benchmark for speech emotion recognition☆28Dec 16, 2022Updated 3 years ago
- LoRA-based phoneme/prosody control for LLM-based TTS with no G2P - Lightweight adapter for editing and controlling the target language's phoneme…☆23Aug 14, 2025Updated 7 months ago