[ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach
☆20Aug 2, 2021Updated 4 years ago
Alternatives and similar repositories for Right2Talk
Users that are interested in Right2Talk are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆29Jun 15, 2022Updated 3 years ago
- Pytorch implementation of our paper: Audio-Visual Speech Separation with Visual Features Enhanced by Adversarial Training.☆18Jul 11, 2022Updated 3 years ago
- A library of speech gadgets.☆14Oct 15, 2022Updated 3 years ago
- A PyTorch implementation: "LASAFT-Net-v2: Listen, Attend and Separate by Attentively aggregating Frequency Transformation"☆33Apr 11, 2022Updated 3 years ago
- Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"☆115Nov 16, 2020Updated 5 years ago
- NordVPN Threat Protection Pro™ • AdTake your cybersecurity to the next level. Block phishing, malware, trackers, and ads. Lightweight app that works with all browsers.
- wake-up word emotion recognition [APSIPA 2022]☆17Nov 11, 2022Updated 3 years ago
- PyTorch implementation of Retriever: Learning Content-Style Representation☆12Jan 27, 2023Updated 3 years ago
- Parallel and High-Fidelity Text-to-Lip Generation; AAAI 2022 ; Official code☆109May 1, 2022Updated 3 years ago
- ☆11May 7, 2022Updated 3 years ago
- Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation☆26Nov 24, 2021Updated 4 years ago
- **ICASSP 2022** 《Toward Degradation-Robust Voice Conversion》Using speech enhancement and end-to-end denoising training to improve degrada…☆24Sep 27, 2022Updated 3 years ago
- Implementation of "Look, Listen and Recognise:character-aware audio-visual subtitling"☆20Nov 3, 2025Updated 4 months ago
- The source code for the paper CrossSinger (asru2023)☆18Oct 12, 2023Updated 2 years ago
- python wrap for hts engine☆14Jan 30, 2018Updated 8 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- ☆13Oct 27, 2021Updated 4 years ago
- Permutation invariant training in PyTorch☆13Oct 2, 2020Updated 5 years ago
- A CSRankings-like index for speech researchers☆35Oct 16, 2024Updated last year
- Source code for ACL 2020 paper "Learning Spoken Language Representations with Neural Lattice Language Modeling"☆17Feb 11, 2023Updated 3 years ago
- Audio Generation model working with GPT-2 and VQVAE compressed representation of MelSpectrograms☆18Oct 8, 2023Updated 2 years ago
- ICASSP 2020 ESPnet-TTS: Merlin baseline system☆36Oct 28, 2019Updated 6 years ago
- ☆24Mar 30, 2024Updated last year
- Audio-Visual Speech Separation with Cross-Modal Consistency☆247Jul 25, 2023Updated 2 years ago
- Code for ICASSP 2019 paper☆18Oct 29, 2018Updated 7 years ago
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- Localize to Binauralize: Audio Spatialization from Visual Sound Source Localization (ICCV 2021)☆10Oct 11, 2021Updated 4 years ago
- Code for the ICASSP-2021 paper: Continuous Speech Separation with Conformer.☆120Mar 18, 2023Updated 3 years ago
- Repo for Visual Acoustic Matching, CVPR 2022☆70Feb 28, 2023Updated 3 years ago
- In this repository, I try to combine k2 with speechbrain to decode well and fastly.☆16Jun 17, 2022Updated 3 years ago
- ☆15May 8, 2021Updated 4 years ago
- Talking Head from Speech Audio using a Pre-trained Image Generator☆23May 7, 2024Updated last year
- ☆15Nov 5, 2021Updated 4 years ago
- Efficient Personalized Speech Enhancement through Self-Supervised Learning☆23Mar 12, 2023Updated 3 years ago
- Generative Adversarial Networks for different impaired speech conversions☆39Jul 6, 2023Updated 2 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- wake word spotting with kaldi☆19Dec 3, 2020Updated 5 years ago
- Scripts for training Kaldi for German speech recognition (ASR).☆27Feb 11, 2021Updated 5 years ago
- The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmente…☆134Dec 4, 2023Updated 2 years ago
- Dynamic vision-guided speaker embedding for audio-visual speaker diarization☆12Jul 5, 2022Updated 3 years ago
- [ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z…☆32Apr 8, 2022Updated 3 years ago
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications☆90Dec 20, 2024Updated last year
- ☆13Aug 13, 2023Updated 2 years ago