facebookresearch / av_hubert
A self-supervised learning framework for audio-visual speech
☆897 · Updated last year
Alternatives and similar repositories for av_hubert:
Users who are interested in av_hubert are comparing it to the repositories listed below.
- MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation ☆381 · Updated last year
- Visual Speech Recognition for Multiple Languages ☆398 · Updated last year
- Audio-Visual Speech Separation with Cross-Modal Consistency ☆230 · Updated last year
- UniSpeech - Large Scale Self-Supervised Learning for Speech ☆456 · Updated last year
- This repo hosts the code and models of "Masked Autoencoders that Listen". ☆581 · Updated last year
- An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-S… ☆405 · Updated last year
- A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper. ☆227 · Updated last year
- Out of time: automated lip sync in the wild ☆752 · Updated last year
- Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand". ☆429 · Updated last year
- In defence of metric learning for speaker recognition ☆1,093 · Updated last year
- Library for Textless Spoken Language Processing ☆540 · Updated last year
- This is the GitHub page for publicly available emotional speech data. ☆345 · Updated 3 years ago
- Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra. ☆698 · Updated last year
- INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. … ☆666 · Updated 4 months ago
- Tools for handling speech data in machine learning projects. ☆1,011 · Updated this week
- A fully working pytorch implementation of NaturalSpeech (Tan et al., 2022) ☆475 · Updated last year
- An Audio Language model for Audio Tasks ☆304 · Updated last year
- The Open Source Code of UniAudio ☆556 · Updated 9 months ago
- Self-Supervised Speech Pre-training and Representation Learning Toolkit ☆2,374 · Updated last month
- Learning audio concepts from natural language supervision ☆547 · Updated 7 months ago
- ICASSP'22 Training Strategies for Improved Lip-Reading; ICASSP'21 Towards Practical Lipreading with Distilled and Efficient Models; ICASS… ☆410 · Updated last year
- List of speech synthesis papers. ☆1,036 · Updated last year
- This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab. ☆582 · Updated last year
- FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion ☆662 · Updated 3 months ago
- ☆1,078 · Updated this week
- This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples a… ☆550 · Updated 10 months ago
- A curated list of awesome voice conversion, projects and communities. ☆228 · Updated 3 months ago
- A deep neural network architecture for low-latency audio processing ☆299 · Updated last year
- Large, modern dataset for speech recognition ☆673 · Updated last year
- ☆1,423 · Updated last year