AMAAI-Lab/DART

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/AMAAI-Lab/DART)

AMAAI-Lab / DART

Demo for DART, Audio Imagination workshop submission in NeurIPS 2024

☆16

Alternatives and similar repositories for DART

Users that are interested in DART are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

AMAAI-Lab / nnAudio2
View on GitHub
GPU-based audio processing. Trainable Fourier kernels.
☆35Jun 8, 2026Updated last month
kamperh / linearvc
View on GitHub
Voice conversion with just linear regression.
☆37Sep 25, 2025Updated 9 months ago
iamanigeeit / present
View on GitHub
☆14Aug 19, 2024Updated last year
Dapwner / CVAE-Tacotron
View on GitHub
☆26Jun 5, 2024Updated 2 years ago
P1ping / TokAN-Legacy
View on GitHub
☆27Jun 22, 2026Updated last month
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
mbzuai-nlp / sttatts
View on GitHub
☆31Oct 29, 2024Updated last year
walker-hyf / NCSSD
View on GitHub
Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
☆61Nov 1, 2024Updated last year
youngsheen / GPST
View on GitHub
[ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer
☆70Nov 1, 2024Updated last year
AMAAI-Lab / t2m-inferalign
View on GitHub
Improving Symbolic Music Generation with Inference-Time Alignment
☆22Aug 2, 2025Updated 11 months ago
hs-oh-prml / DurFlexEVC
View on GitHub
☆81Jan 22, 2025Updated last year
kaistmm / voxceleb-disentangler
View on GitHub
[INTERSPEECH 2024] Official pytorch code for the paper "Disentangled Representation Learning for Environment-agnostic Speaker Recognition…
☆18Jul 23, 2024Updated last year
yoongi43 / VRVQ
View on GitHub
Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"
☆11Apr 10, 2025Updated last year
SonyResearch / VRVQ
View on GitHub
Variable Bitrate Residual Vector Quantization for Audio Coding
☆54May 1, 2025Updated last year
ahmedshah1494 / speech_robust_bench
View on GitHub
☆18Apr 24, 2025Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
unilight / seq2seq-vc
View on GitHub
A sequence-to-sequence voice conversion toolkit.
☆113Mar 15, 2026Updated 4 months ago
PecholaL / MAIN-VC
View on GitHub
Lightweight Speech Representation Learning for One-Shot Voice Conversion
☆23Dec 12, 2024Updated last year
malradhi / PACodec
View on GitHub
[ICASSP 2026]Official code for "Prosody-Guided Harmonic Attention for Phase-Coherent Neural Vocoding in the Complex Spectrum"
☆27Jan 22, 2026Updated 6 months ago
Many0therFunctions / MaskGCT-Text-To-Semantic-Finetune
View on GitHub
This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. …
☆13Dec 4, 2024Updated last year
line / promptttspp
View on GitHub
PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions
☆86Oct 11, 2024Updated last year
ajd12342 / paraspeechcaps
View on GitHub
Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets'
☆162Mar 26, 2026Updated 3 months ago
KTTRCDL / UMETTS
View on GitHub
UMETTS: A Unified Framework for Emotional Text-to-Speech Synthesis with Multimodal Prompts
☆41Jun 12, 2025Updated last year
yjzxkxdn / Mini-DDSP
View on GitHub
☆16Mar 31, 2025Updated last year
yangdongchao / SimpleSpeech
View on GitHub
The open source code for SimpleSpeech series
☆147Oct 8, 2024Updated last year
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
X-E-Speech / X-E-Speech-code
View on GitHub
X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion
☆112Apr 1, 2024Updated 2 years ago
winddori2002 / DEX-TTS
View on GitHub
DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability
☆108Jan 17, 2025Updated last year
zhenye234 / FlashSpeech
View on GitHub
ACM MM 2024 FlashSpeech: Efficient Zero-Shot Speech Synthesis
☆155Sep 20, 2024Updated last year
jzmzhong / Automatic-Prosody-Annotator-with-SSWP-CLAP
View on GitHub
An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS).
☆50Jun 11, 2024Updated 2 years ago
ogunlao / glowtts_stdp
View on GitHub
Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor
☆19Jun 5, 2023Updated 3 years ago
yukara-ikemiya / Open-Miipher-2
View on GitHub
PyTorch implementation of Miipher-2 [2025] which is a speech restoration model by Google DeepMind
☆70Sep 22, 2025Updated 10 months ago
lakahaga / dc-comix-tts
View on GitHub
Implementation of DCComix TTS: An End-to-End Expressive TTS with Discrete Code Collaborated with Mixer
☆74Aug 21, 2023Updated 2 years ago
juice500ml / xlm_to_xlsr
View on GitHub
Official implementation of the paper "Distilling a Pretrained Language Model to a Multilingual ASR Model" (Interspeech 2022)
☆12Mar 12, 2024Updated 2 years ago
YangXusheng-yxs / CodecFormer_5Hz
View on GitHub
☆35Oct 23, 2025Updated 8 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
AMAAI-Lab / JamendoMaxCaps
View on GitHub
JamendoMaxCaps is a large-scale dataset of 362,000 instrumental creative commons tracks
☆53May 24, 2025Updated last year
ZehuaKcrissLi / GTR-Voice
View on GitHub
☆16Nov 11, 2024Updated last year
Choddeok / EmoSpherepp
View on GitHub
[TAFFC 2025] The official implementation of EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vec…
☆129Updated this week
Audio-Foundation-Models / ConversationTTS
View on GitHub
☆101Jan 19, 2026Updated 6 months ago
declare-lab / nora-1.5
View on GitHub
NORA-1.5: A Vision-Language-Action Model Trained using World Model- and Action-based Preference Rewards
☆109Jan 11, 2026Updated 6 months ago
ictnlp / ComSpeech
View on GitHub
Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".
☆27Jul 2, 2024Updated 2 years ago
walker-hyf / GPT-Talker
View on GitHub
Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
☆78Nov 1, 2024Updated last year