NZqian / RapBank
☆69Updated 10 months ago
Alternatives and similar repositories for RapBank
Users that are interested in RapBank are comparing it to the libraries listed below
- a text-conditional diffusion probabilistic model capable of generating high fidelity audio.☆169Updated last year
- Github repository for ACL 2025 paper: Recent Advances in Speech Language Models: A Survey.☆121Updated last month
- CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages [ACL 2025]☆175Updated 2 months ago
- OpenS2S : Advancing Fully Open-Source End-to-End Empathetic Large Speech Language Model☆62Updated 2 weeks ago
- A curated list of Video to Audio Generation☆63Updated last month
- [ICML 2025] SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation☆248Updated 3 weeks ago
- JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment☆31Updated this week
- Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers☆105Updated 2 months ago
- ☆285Updated 3 months ago
- ☆85Updated last month
- ☆61Updated last month
- We Speech Transcript based on LLM, in 300 lines of code.☆174Updated last month
- Official repository of the paper "MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization".☆229Updated 6 months ago
- ☆27Updated 3 weeks ago
- A Foundation Model for Industrial Signal Comprehensive Representation☆18Updated last week
- flow mirror models from JZX AI Labs☆44Updated 10 months ago
- Towards Fine-grained Audio Captioning with Multimodal Contextual Cues☆76Updated last month
- An easy-to-use, fast, and easily integrable tool for evaluating audio LLM☆128Updated this week
- Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction☆207Updated 5 months ago
- Implementation of Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt (NAACL'24).☆110Updated 6 months ago
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,…☆75Updated 9 months ago
- Curated list for papers, codes and resources related to Text-to-Audio (TTA) Generation☆59Updated 2 weeks ago
- official code for CVPR'24 paper Diff-BGM☆66Updated 9 months ago
- ACM MM 2023 CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model☆210Updated last year
- TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud…☆101Updated 7 months ago
- AMT-APC: Automatic Piano Cover by Fine-Tuning an Automatic Music Transcription Model☆69Updated 4 months ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.☆95Updated 7 months ago
- ☆40Updated 5 months ago
- Benchmark for evaluating TTS models on complex prosodic, expressiveness, and linguistic challenges.☆114Updated this week
- Official codebase for "Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis" (https://arxiv.org/abs/2312.03491).☆127Updated last year