thuhcsi / dpss-exp3-VC-BNF
Voice Conversion Experiments for THUHCSI Course: <Digital Processing of Speech Signals>
☆8 · Updated last year
Related projects:
- [Official Implementation] Acoustic Autoregressive Modeling ☆52 · Updated 3 weeks ago
- ☆33 · Updated 2 months ago
- Encoder of BAT (Learning to Reason about Spatial Sounds with Large Language Models) ☆28 · Updated 3 months ago
- Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers ☆68 · Updated 2 months ago
- ☆22 · Updated 5 months ago
- ☆10 · Updated this week
- ☆21 · Updated this week
- PyTorch implementation for "V2C: Visual Voice Cloning" ☆30 · Updated last year
- [ECCV 2024 Oral] Audio-Synchronized Visual Animation ☆23 · Updated last week
- Source code for the paper 'Audio Captioning Transformer' ☆47 · Updated 2 years ago
- This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge. ☆30 · Updated 7 months ago
- ☆27 · Updated 9 months ago
- ☆23 · Updated last month
- Localize to Binauralize: Audio Spatialization from Visual Sound Source Localization (ICCV 2021) ☆9 · Updated 2 years ago
- Implementation of Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt (NAACL'24). ☆61 · Updated 2 months ago
- [Findings of NAACL 2024] Source code of paper CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers a… ☆60 · Updated 5 months ago
- [INTERSPEECH 2023 Best Paper Shortlist] Official implementation for MT4SSL: Boosting Self-Supervised Speech Representation Learning by In… ☆42 · Updated 5 months ago
- ☆35 · Updated last year
- Adversarial Training of Denoising Diffusion Model Using Dual Discriminators for High-Fidelity Multi-Speaker TTS ☆31 · Updated last year
- Project page for "Improving Few-shot Learning for Talking Face System with TTS Data Augmentation" for ICASSP2023 ☆82 · Updated 11 months ago
- Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation". ☆55 · Updated last month
- DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code ☆10 · Updated 2 years ago
- ☆33 · Updated 5 months ago
- An unofficial implementation of "UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding". ☆22 · Updated 10 months ago
- ☆26 · Updated 3 months ago
- Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI'2024) ☆48 · Updated 3 months ago
- Official code for CVPR'24 paper Diff-BGM ☆38 · Updated 5 months ago
- Findings of ACL 2023 | AlignSTS: a speech-to-singing (STS) model based on modality disentanglement and cross-modal alignment ☆62 · Updated 2 months ago
- [ACL 2024] This is the PyTorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing" ☆32 · Updated last month
- ESLTTS dataset ☆15 · Updated 3 months ago