cvlab-columbia / voicecamo
Code for the paper Real-Time Neural Voice Camouflage
☆28Updated 3 years ago
Alternatives and similar repositories for voicecamo
Users who are interested in voicecamo are comparing it to the repositories listed below
- Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis☆39Updated last year
- Facestar dataset. High quality audio-visual recordings of human conversational speech.☆109Updated 3 years ago
- SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022☆115Updated 2 years ago
- Demo for 2022 ICASSP☆64Updated 3 years ago
- Training code and trained checkpoints for ASGAN.☆62Updated last year
- Unsupervised Rhythm Modeling for Voice Conversion☆84Updated 2 years ago
- This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image …☆85Updated last year
- ☆20Updated 3 years ago
- The demo page of UniAudio☆34Updated last year
- Speaker embedding for VI-SVC and VI-SVS, also for VITS; use this to replace the ID to implement voice cloning.☆30Updated 2 years ago
- An AR+AR TTS attempt.☆16Updated 7 months ago
- [ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach☆20Updated 4 years ago
- GPT for FACodec☆13Updated last year
- Temporary anonymous version☆22Updated last year
- [ICLR2022] Code for "Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph"☆54Updated 2 years ago
- GPT-style network for phonemization with durations of text☆67Updated last year
- Pushing the Limits of Zero-shot End-to-End Speech Translation☆25Updated 8 months ago
- Codebase and project page for EDMSound☆34Updated last year
- A spoken version of the textual story cloze benchmark☆18Updated 2 years ago
- The project page repo for Neural Dubber.☆30Updated last year
- ☆60Updated last year
- ☆23Updated 2 years ago
- Official implementation of MelHuBERT☆66Updated 9 months ago
- ESLTTS dataset☆16Updated 6 months ago
- Transcribing Speech with Multinomial Diffusion, training code and models.☆78Updated last year
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation☆88Updated 10 months ago
- Official release of StyleTalk dataset.☆69Updated last year
- Implementation of Emo-StarGAN☆45Updated last year
- Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".☆24Updated last year
- VoiceLDM: Text-to-Speech with Environmental Context☆182Updated last year