Yuan-ManX / audio-ai-agent
Here we will track the latest audio AI agents, including speech, music, and sound effects.
☆16Dec 8, 2023Updated 2 years ago
Alternatives and similar repositories for audio-ai-agent
Users who are interested in audio-ai-agent are comparing it to the libraries listed below
- This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge.☆32Jan 26, 2024Updated 2 years ago
- semantic tokenizer for speech and music☆21Jul 6, 2025Updated 7 months ago
- Basic library for spatial audio SOFA files☆12Sep 29, 2020Updated 5 years ago
- [DEPRECATED] [PyTorch 2.0] [638M] [85.33% acc] Full-attention multi-instrumental music transformer for supervised music generation, opti…☆32Nov 23, 2023Updated 2 years ago
- Prediction of sound event bounding boxes (SEBBs)☆32Aug 2, 2024Updated last year
- A DIY head tracker for 3D audio production☆18Mar 20, 2023Updated 2 years ago
- ☆23Feb 2, 2022Updated 4 years ago
- ☆37Jul 4, 2024Updated last year
- SouPyX: An Audio Exploration Space.🪐☆42Nov 28, 2023Updated 2 years ago
- Event Relation in Text-to-Audio (TTA) Generation☆20Feb 26, 2025Updated 11 months ago
- Ableton MIDI-Clip generation using GPT-4☆45Jan 14, 2026Updated last month
- 60k hours of phoneme-aligned audio from audio books☆19Jul 27, 2024Updated last year
- Standalone real time dynamic vocal harmonizer☆25Nov 28, 2023Updated 2 years ago
- An End-to-End Pipeline for Enhanced French Text-to-Speech with SSML Prosody Control☆30Jan 13, 2026Updated last month
- Pushing the Limits of Zero-shot End-to-End Speech Translation☆26Dec 12, 2024Updated last year
- A toolkit dedicated to speech evaluation.☆24Sep 26, 2024Updated last year
- Various plugins created for Wwise☆25Jul 15, 2019Updated 6 years ago
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆26Aug 5, 2024Updated last year
- An AI that generates harsh song reviews☆24Feb 25, 2024Updated last year
- Musical Agent based on Self-Organizing Maps☆23Feb 6, 2023Updated 3 years ago
- Prosody and Pronunciation Modification Network☆62May 5, 2025Updated 9 months ago
- Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis"☆107Dec 20, 2025Updated last month
- ☆59Oct 22, 2025Updated 3 months ago
- A Jupyter book accompanying the ISMIR 2023 tutorial Introduction to Differentiable Audio Synthesiser Programming☆62Jun 30, 2025Updated 7 months ago
- Unofficial PyTorch implementation of "Autoregressive Speech Synthesis without Vector Quantization (MELLE)"☆41Jun 28, 2025Updated 7 months ago
- ☆25Jan 2, 2024Updated 2 years ago
- A parser for annotated MuseScore 3 files.☆53Mar 25, 2025Updated 10 months ago
- [ICLR 2026] Data Pipeline, Models, and Benchmark for Omni-Captioner.☆119Oct 17, 2025Updated 3 months ago
- "Fx-Encoder++: Extracting Instrument-wise Audio Effect Representations from Mixtures"☆47Aug 23, 2025Updated 5 months ago
- mp3 as VST-effect☆59Oct 20, 2024Updated last year
- Llasa Speed Up☆57Jan 18, 2026Updated 3 weeks ago
- A list of free audio and MIDI plugins for music production☆33Jun 29, 2022Updated 3 years ago
- This is the official implementation of MusER (AAAI'24).☆30Jun 4, 2025Updated 8 months ago
- A JUCE module that wraps the resvg SVG rendering library in a JUCE compatible interface☆30Oct 21, 2025Updated 3 months ago
- E2E TTS using Conditional Flow Matching (Experimental*)☆71Nov 10, 2023Updated 2 years ago
- An open-source Kazakh Emotional Text-to-Speech Dataset☆35Aug 1, 2025Updated 6 months ago
- Official implementation of the paper titled "Age and Gender Recognition Using a Convolutional Neural Network with a Specially Designed Mu…☆27Mar 5, 2024Updated last year
- MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows☆123Sep 2, 2025Updated 5 months ago
- Official code for "EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting"☆109Oct 16, 2025Updated 4 months ago