Manage audio and video datasets
☆35 · Apr 16, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for audb
Users that are interested in audb are comparing it to the libraries listed below.
- Format to store media files and annotations ☆12 · Apr 15, 2026 · Updated 2 weeks ago
- Handling audio files in Python ☆39 · Apr 15, 2026 · Updated 2 weeks ago
- Machine learning speaker characteristics ☆44 · Updated this week
- German prenames as CSV data ☆13 · Mar 6, 2018 · Updated 8 years ago
- A Modular and Extensible Deep Learning Toolkit for Computer Audition Tasks. ☆23 · Mar 27, 2026 · Updated last month
- Gamera 4 for Python 3 ☆14 · May 16, 2025 · Updated 11 months ago
- Label and annotate a large number of speech data files ☆12 · May 5, 2021 · Updated 4 years ago
- S3PRL for Speech Emotion Recognition (see s3prl > downstream) ☆15 · Feb 28, 2026 · Updated 2 months ago
- GPT for FACodec ☆13 · Mar 25, 2024 · Updated 2 years ago
- ☆13 · Oct 11, 2024 · Updated last year
- ☆13 · Feb 8, 2017 · Updated 9 years ago
- Emofilt is a program to simulate emotional arousal with speech synthesis based on the free-for-non-commercial-use MBROLA synthesis engine… ☆14 · Mar 17, 2022 · Updated 4 years ago
- Supervoice diffusion enhance ☆28 · Jul 15, 2024 · Updated last year
- Repository for speech paper reading ☆33 · Aug 19, 2021 · Updated 4 years ago
- openXBOW - the Passau Open-Source Crossmodal Bag-of-Words Toolkit ☆84 · Feb 17, 2021 · Updated 5 years ago
- NEAL (Nature+Energy Audio Labeller) is an open-source interactive audio data annotation tool. ☆18 · Apr 7, 2025 · Updated last year
- Score-aligned loudness, beat, and expressive markings data for 2000 Chopin Mazurka recordings ☆14 · Jul 6, 2023 · Updated 2 years ago
- ☆19 · Mar 22, 2024 · Updated 2 years ago
- High-performance, semantic turn detection for conversational AI ☆37 · Oct 1, 2025 · Updated 7 months ago
- A Python package for estimating the impact of features on ML models ☆14 · May 18, 2023 · Updated 2 years ago
- ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models (TTS) ☆10 · Mar 9, 2024 · Updated 2 years ago
- melodic object transcription framework ☆26 · Nov 15, 2017 · Updated 8 years ago
- The Munich Open-Source Large-Scale Multimedia Feature Extractor ☆808 · Jan 26, 2026 · Updated 3 months ago
- Pushing the Limits of Zero-shot End-to-End Speech Translation ☆25 · Dec 12, 2024 · Updated last year
- ☆16 · Dec 18, 2023 · Updated 2 years ago
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one ☆26 · Aug 5, 2024 · Updated last year
- ☆157 · Jan 24, 2021 · Updated 5 years ago
- ☆36 · Feb 23, 2017 · Updated 9 years ago
- Tool to analyze audio corpora in terms of intonation, intensity, duration and voice quality ☆23 · Jun 17, 2019 · Updated 6 years ago
- The SEILS Dataset ☆17 · Oct 24, 2021 · Updated 4 years ago
- Split and stitch AAC without the wait! ☆29 · Apr 17, 2024 · Updated 2 years ago
- DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code ☆10 · Mar 8, 2022 · Updated 4 years ago
- An unofficial implementation of "UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding". ☆26 · Nov 4, 2023 · Updated 2 years ago
- Evaluation Protocol for Large-Scale Zero-Shot TTS Literature ☆93 · Mar 12, 2025 · Updated last year
- Render wav and convert it with [Diff-SVC](https://github.com/prophesier/diff-svc) model ☆10 · Aug 24, 2025 · Updated 8 months ago
- Inference code for Interspeech 2025 paper, "LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec" ☆35 · Oct 23, 2025 · Updated 6 months ago
- Non Parallel Voice Conversion based on VITS ☆24 · Mar 31, 2023 · Updated 3 years ago
- SERAB: a multi-lingual benchmark for speech emotion recognition ☆28 · Dec 16, 2022 · Updated 3 years ago
- Example of application of genetic algorithm for evolution kart navigation. ☆11 · Nov 21, 2019 · Updated 6 years ago