cvlab-columbia / voicecamo
Code for the paper Real-Time Neural Voice Camouflage
☆28 Updated 2 years ago
Related projects
Alternatives and complementary repositories for voicecamo
- Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis ☆37 Updated last year
- Codebase and project page for EDMSound ☆29 Updated last year
- This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image … ☆77 Updated 5 months ago
- SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022 ☆109 Updated last year
- Adversarial Training of Denoising Diffusion Model Using Dual Discriminators for High-Fidelity Multi-Speaker TTS ☆35 Updated last year
- [ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach ☆20 Updated 3 years ago
- ☆48 Updated last year
- Official Code Implementation for 'A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models' ☆13 Updated 3 months ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆66 Updated last week
- Pytorch implementation for “V2C: Visual Voice Cloning” ☆30 Updated last year
- ☆14 Updated last year
- Code and pre-trained model release for the ICASSP 2023 Paper "NORD NON-MATCHING REFERENCE BASED RELATIVE DEPTH ESTIMATION FROM BINAURAL A… ☆11 Updated 6 months ago
- ☆11 Updated 4 months ago
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture. ☆42 Updated 2 months ago
- ☆81 Updated 2 months ago
- ☆45 Updated last year
- Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using β-VAE" ☆42 Updated last year
- Facestar dataset. High quality audio-visual recordings of human conversational speech. ☆104 Updated 2 years ago
- INTERSPEECH2023: Target Active Speaker Detection with Audio-visual Cues ☆46 Updated last year
- Anim-400K: A dataset designed from the ground up for automated dubbing of video ☆99 Updated 5 months ago
- ☆13 Updated 11 months ago
- Training code and trained checkpoints for ASGAN. ☆60 Updated 10 months ago
- [ACL 2024] This is the Pytorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing" ☆50 Updated last week
- Transcribing Speech with Multinomial Diffusion, training code and models. ☆75 Updated last year
- ☆26 Updated last year
- GPT for FACodec ☆13 Updated 7 months ago
- Demo for 2022 ICASSP ☆64 Updated 2 years ago
- Implementation of NWT, audio-to-video generation, in Pytorch ☆87 Updated 2 years ago
- Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022) ☆51 Updated 9 months ago