Haoqiu-Yan / PerceptiveAgent
Code for "Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction" (ACL 2024)
☆28 Updated 3 months ago
Related projects
Alternatives and complementary repositories for PerceptiveAgent
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer ☆33 Updated last week
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆65 Updated 7 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆40 Updated last week
- Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis ☆37 Updated last year
- BLSP-Emo: Towards Empathetic Large Speech-Language Models ☆36 Updated 5 months ago
- The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions. ☆49 Updated last month
- SimVQ: Addressing Representation Collapse in Vector Quantized Models with One Linear Layer ☆90 Updated this week
- Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers ☆81 Updated 2 weeks ago
- ☆34 Updated 6 months ago
- VoiceBank-2023 is the speech corpus specially designed for constructing personalized Mandarin text-to-speech (TTS) systems. ☆36 Updated last year
- Code and pretrained models for "DUB: Discrete Unit Back-translation for Speech Translation" (ACL 2023 Findings) ☆26 Updated last year
- Official release of StyleTalk dataset. ☆57 Updated 4 months ago
- Adversarial Training of Denoising Diffusion Model Using Dual Discriminators for High-Fidelity Multi-Speaker TTS ☆35 Updated last year
- [ACMMM'2024] Generative Expressive Conversational Speech Synthesis ☆25 Updated last week
- ☆76 Updated 2 months ago
- All generative model in one for better TTS model ☆66 Updated 2 months ago
- [WIP] Unofficial Implementation of Microsoft's PromptTTS2 ☆51 Updated last year
- An open-source Kazakh Emotional Text-to-Speech Dataset ☆25 Updated 7 months ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆62 Updated last week
- ☆43 Updated 4 months ago
- E2E TTS using Conditional Flow Matching (Experimental*) ☆66 Updated 11 months ago
- Official Code for ParrotTTS ☆41 Updated 3 weeks ago
- ☆25 Updated 3 months ago
- Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?". ☆23 Updated 4 months ago
- ☆19 Updated 2 months ago
- ☆33 Updated last year
- Codebase and project page for EDMSound ☆29 Updated 11 months ago
- trying to reproduce suno v3 ☆25 Updated 7 months ago
- The open source code for LLM-Codec ☆114 Updated 2 months ago