Official Repository of IJCAI 2024 Paper: "BATON: Aligning Text-to-Audio Model with Human Preference Feedback"
☆32Mar 4, 2025Updated last year
Alternatives and similar repositories for Baton
Users that are interested in Baton are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- The official implementation of work "AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward".☆18Mar 25, 2025Updated last year
- code for A Large-scale Dataset for Audio-Language Representation Learning☆14Sep 18, 2024Updated last year
- A solution to denoising and separating for two-speaker-mixed noisy speech, using a BSRNN inspired network.☆15Aug 22, 2023Updated 2 years ago
- [ACMMM 2024] Consistent123: One Image to Highly Consistent 3D Asset Using Case-Aware Diffusion Priors☆25Oct 22, 2024Updated last year
- LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024)☆43Jun 13, 2024Updated last year
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models☆22Jul 10, 2024Updated last year
- The official implementation of work "REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment".☆121Sep 14, 2024Updated last year
- Official implementation of Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models☆43Mar 17, 2026Updated last week
- ☆51Updated this week
- A toolkit for researchers in the multimodal sound separation.☆16Oct 20, 2023Updated 2 years ago
- small audio language model for reasoning☆85Dec 4, 2025Updated 3 months ago
- ☆45Jun 11, 2024Updated last year
- PyTorch Implementation of [AudioLCM]: a efficient and high-quality text-to-audio generation with latent consistency model.☆13Jun 15, 2024Updated last year
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.☆114Jan 28, 2026Updated 2 months ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- Query-conditioned target sound extraction model☆30Mar 25, 2025Updated last year
- ICML2025☆65Aug 28, 2025Updated 7 months ago
- ☆44Apr 2, 2025Updated 11 months ago
- ☆43Jan 13, 2025Updated last year
- Official Repository of Paper: "Emilia-NV: A Non-Verbal Speech Dataset with Word-Level Annotation for Human-Like Speech Modeling"☆87Sep 18, 2025Updated 6 months ago
- ☆18May 4, 2025Updated 10 months ago
- The official implementation of DMEL the method presented in the paper "DMEL: The differentiable log-Mel spectrogram as a trainable layer …☆22Dec 21, 2024Updated last year
- ☆56Jul 13, 2025Updated 8 months ago
- Data simulation scripts for paper "Target Sound Extraction with Variable Cross-modality Clues"☆17May 19, 2023Updated 2 years ago
- Bare Metal GPUs on DigitalOcean Gradient AI • AdPurpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
- ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation☆39Nov 20, 2024Updated last year
- TG-CRITIC: A TIMBRE-GUIDED MODEL FOR REFERENCE-INDEPENDENT SINGING EVALUATION☆15May 26, 2023Updated 2 years ago
- ☆10Sep 29, 2015Updated 10 years ago
- ☆10Sep 25, 2024Updated last year
- Code for "A diffusion-inspired training strategy for singing voice extraction in the waveform domain" (ISMIR 2022)☆17Feb 16, 2023Updated 3 years ago
- Curated list for papers, codes and resources related to Text-to-Audio (TTA) Generation☆70Jan 22, 2026Updated 2 months ago
- Source code for the paper 'Audio Captioning Transformer'☆56Jan 18, 2022Updated 4 years ago
- ☆10Oct 16, 2025Updated 5 months ago
- ☆18Aug 23, 2024Updated last year
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- [AAAI 2024] V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models☆28Dec 14, 2023Updated 2 years ago
- Official Implementation of LauraTSE: Target Speaker Extraction using Auto-Regressive Decoder-Only Language Models.☆33Nov 9, 2025Updated 4 months ago
- Audio Entailment: Deductive Reasoning for Audio Understanding☆17Dec 10, 2024Updated last year
- Repository for "Training Audio Captioning Models without Audio"☆10Sep 26, 2023Updated 2 years ago
- ☆13Sep 12, 2024Updated last year
- High-quality Text-to-Audio Generation with Efficient Diffusion Transformer☆330Dec 17, 2025Updated 3 months ago
- Fine-tune Stable Audio Open with DiT ControlNet.☆249May 16, 2025Updated 10 months ago