Hannieliao / BatonView external linksLinks
Official Repository of IJCAI 2024 Paper: "BATON: Aligning Text-to-Audio Model with Human Preference Feedback"
☆32Mar 4, 2025Updated 11 months ago
Alternatives and similar repositories for Baton
Users that are interested in Baton are comparing it to the libraries listed below
Sorting:
- LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024)☆43Jun 13, 2024Updated last year
- Official implementation of Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models☆43Mar 3, 2025Updated 11 months ago
- code for A Large-scale Dataset for Audio-Language Representation Learning☆14Sep 18, 2024Updated last year
- A solution to denoising and separating for two-speaker-mixed noisy speech, using a BSRNN inspired network.☆14Aug 22, 2023Updated 2 years ago
- A toolkit for researchers in the multimodal sound separation.☆16Oct 20, 2023Updated 2 years ago
- Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models☆22Jul 10, 2024Updated last year
- The official implementation of DMEL the method presented in the paper "DMEL: The differentiable log-Mel spectrogram as a trainable layer …☆22Dec 21, 2024Updated last year
- ☆18May 4, 2025Updated 9 months ago
- [CVPR 2025] Official implementation of paper "Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie…☆23Jun 6, 2025Updated 8 months ago
- ☆40Apr 2, 2025Updated 10 months ago
- ☆49Apr 13, 2025Updated 10 months ago
- music semantic understanding evaluation benchmark☆25Aug 12, 2023Updated 2 years ago
- Official Implementation of LauraTSE: Target Speaker Extraction using Auto-Regressive Decoder-Only Language Models.☆32Nov 9, 2025Updated 3 months ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.☆113Jan 28, 2026Updated 2 weeks ago
- small audio language model for reasoning☆86Dec 4, 2025Updated 2 months ago
- ☆45Jun 11, 2024Updated last year
- ☆56Jul 13, 2025Updated 7 months ago
- Query-conditioned target sound extraction model☆30Mar 25, 2025Updated 10 months ago
- Codes and MIDI demos of ISMIR 2022 paper: Domain Adversarial Training on Conditional Variational Auto-Encoder for Controllable Music Gene…☆21Mar 28, 2023Updated 2 years ago
- offical code for Dense-TSNet☆12Sep 17, 2024Updated last year
- ☆10Oct 16, 2025Updated 3 months ago
- Repository for reproducing result in journal "Self-supervised learning for Speech Emotion Recognition"☆10Mar 15, 2023Updated 2 years ago
- ☆11Sep 25, 2024Updated last year
- Repository for "Training Audio Captioning Models without Audio"☆10Sep 26, 2023Updated 2 years ago
- Official Repository of Paper: "Emilia-NV: A Non-Verbal Speech Dataset with Word-Level Annotation for Human-Like Speech Modeling"☆83Sep 18, 2025Updated 4 months ago
- ☆43Jan 13, 2025Updated last year
- Music Generative Pretrained Transformer☆27Aug 23, 2022Updated 3 years ago
- [AAAI 2024] V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models☆27Dec 14, 2023Updated 2 years ago
- Official repository of Myna: Masking-Based Contrastive Learning of Musical Representations☆17Mar 31, 2025Updated 10 months ago
- CCMusic, an open Chinese music database, integrates diverse datasets. It ensures data consistency via cleaning, label refinement and stru…☆26Oct 31, 2025Updated 3 months ago
- PyTorch Implementation of [AudioLCM]: a efficient and high-quality text-to-audio generation with latent consistency model.☆13Jun 15, 2024Updated last year
- Official implementation of "Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound". IEEE TASLP 20…☆16Sep 29, 2025Updated 4 months ago
- Art2Mus is a system that generates music based on digitized artworks and text by using the AudioLDM2 architecture with an added projectio…☆19Oct 20, 2025Updated 3 months ago
- 基于PC-DDSP和nsf-HiFiGAN的声码器☆18Jul 17, 2023Updated 2 years ago
- ☆10Sep 29, 2015Updated 10 years ago
- Audio Entailment: Deductive Reasoning for Audio Understanding☆17Dec 10, 2024Updated last year
- TG-CRITIC: A TIMBRE-GUIDED MODEL FOR REFERENCE-INDEPENDENT SINGING EVALUATION☆15May 26, 2023Updated 2 years ago
- LLaSE: Maximizing Acoustic Preservation for LLaMA based Speech Enhancement☆16Jul 11, 2025Updated 7 months ago
- A standardized toolkit of Kernel Audio Distance (KAD)—a distribution-free, unbiased, and computationally efficient metric for evaluating …☆94Jun 12, 2025Updated 8 months ago