[ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach
☆20Aug 2, 2021Updated 4 years ago
Alternatives and similar repositories for Right2Talk
Users that are interested in Right2Talk are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆29Jun 15, 2022Updated 3 years ago
- Pytorch implementation of our paper: Audio-Visual Speech Separation with Visual Features Enhanced by Adversarial Training.☆18Jul 11, 2022Updated 3 years ago
- A library of speech gadgets.☆14Oct 15, 2022Updated 3 years ago
- A PyTorch implementation: "LASAFT-Net-v2: Listen, Attend and Separate by Attentively aggregating Frequency Transformation"☆33Apr 11, 2022Updated 4 years ago
- Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"☆115Nov 16, 2020Updated 5 years ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- wake-up word emotion recognition [APSIPA 2022]☆17Nov 11, 2022Updated 3 years ago
- PyTorch implementation of Retriever: Learning Content-Style Representation☆12Jan 27, 2023Updated 3 years ago
- Parallel and High-Fidelity Text-to-Lip Generation; AAAI 2022 ; Official code☆109May 1, 2022Updated 3 years ago
- ☆11May 7, 2022Updated 3 years ago
- Public release of the Sound Effect Foundation model by Sony AI.☆79Apr 7, 2026Updated last week
- Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation☆26Nov 24, 2021Updated 4 years ago
- **ICASSP 2022** 《Toward Degradation-Robust Voice Conversion》Using speech enhancement and end-to-end denoising training to improve degrada…☆24Sep 27, 2022Updated 3 years ago
- Implementation of "Look, Listen and Recognise:character-aware audio-visual subtitling"☆20Nov 3, 2025Updated 5 months ago
- The source code for the paper CrossSinger (asru2023)☆18Oct 12, 2023Updated 2 years ago
- Serverless GPU API endpoints on Runpod - Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- python wrap for hts engine☆14Jan 30, 2018Updated 8 years ago
- ☆13Oct 27, 2021Updated 4 years ago
- Permutation invariant training in PyTorch☆13Oct 2, 2020Updated 5 years ago
- A CSRankings-like index for speech researchers☆35Oct 16, 2024Updated last year
- Source code for ACL 2020 paper "Learning Spoken Language Representations with Neural Lattice Language Modeling"☆17Feb 11, 2023Updated 3 years ago
- Audio Generation model working with GPT-2 and VQVAE compressed representation of MelSpectrograms☆18Oct 8, 2023Updated 2 years ago
- ICASSP 2020 ESPnet-TTS: Merlin baseline system☆36Oct 28, 2019Updated 6 years ago
- ☆24Mar 30, 2024Updated 2 years ago
- Audio-Visual Speech Separation with Cross-Modal Consistency☆248Jul 25, 2023Updated 2 years ago
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- Code for ICASSP 2019 paper☆18Oct 29, 2018Updated 7 years ago
- Localize to Binauralize: Audio Spatialization from Visual Sound Source Localization (ICCV 2021)☆10Oct 11, 2021Updated 4 years ago
- Code for the ICASSP-2021 paper: Continuous Speech Separation with Conformer.☆120Mar 18, 2023Updated 3 years ago
- Repo for Visual Acoustic Matching, CVPR 2022☆70Feb 28, 2023Updated 3 years ago
- In this repository, I try to combine k2 with speechbrain to decode well and fastly.☆16Jun 17, 2022Updated 3 years ago
- ☆15May 8, 2021Updated 4 years ago
- Talking Head from Speech Audio using a Pre-trained Image Generator☆23May 7, 2024Updated last year
- ☆15Nov 5, 2021Updated 4 years ago
- Efficient Personalized Speech Enhancement through Self-Supervised Learning☆23Mar 12, 2023Updated 3 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- Generative Adversarial Networks for different impaired speech conversions☆39Jul 6, 2023Updated 2 years ago
- wake word spotting with kaldi☆19Dec 3, 2020Updated 5 years ago
- Scripts for training Kaldi for German speech recognition (ASR).☆27Feb 11, 2021Updated 5 years ago
- Dynamic vision-guided speaker embedding for audio-visual speaker diarization☆12Jul 5, 2022Updated 3 years ago
- The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmente…☆135Dec 4, 2023Updated 2 years ago
- [ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z…☆32Apr 8, 2022Updated 4 years ago
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications☆90Dec 20, 2024Updated last year