liu12366262626/AlignVSR

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/liu12366262626/AlignVSR)

liu12366262626 / AlignVSR

Visual Speech Recongnition

☆21

Alternatives and similar repositories for AlignVSR

Users that are interested in AlignVSR are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

liu12366262626 / CNVSRC2025
View on GitHub
Official CNVSRC2025 Competition Baseline
☆17Jun 27, 2025Updated last year
vTAD2025-Challenge / vTAD
View on GitHub
☆17Oct 24, 2025Updated 9 months ago
JeongHun0716 / zero-avsr
View on GitHub
Official PyTorch implementation for "Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech …
☆37May 11, 2025Updated last year
DataoceanAI / CNVSRC2023Baseline
View on GitHub
Baseline system for CNVSRC2023 (Chinese Continuous Visual Speech Recognition Challenge 2023)
☆23Apr 27, 2024Updated 2 years ago
iamanigeeit / present
View on GitHub
☆14Aug 19, 2024Updated last year
Proton VPN Special Offer - Get 70% off • Ad
Special partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
JunyiPeng00 / SLT22_MultiHead-Factorized-Attentive-Pooling
View on GitHub
An attention-based backend allowing efficient fine-tuning of transformer models for speaker verification
☆24Sep 22, 2024Updated last year
ahaliassos / raven
View on GitHub
Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024)
☆82Feb 27, 2025Updated last year
ductuantruong / enskd
View on GitHub
[ICASSP'24] Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Speaker Verification
☆16Mar 20, 2024Updated 2 years ago
BiSinger-SVS / BiSinger
View on GitHub
Bilingual Singing Voice Synthesis
☆18Mar 25, 2024Updated 2 years ago
KAIST-AILab / SyncVSR
View on GitHub
[Interspeech 2024] SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization
☆61Mar 26, 2025Updated last year
SSTC-Challenge / SSTC2024_baseline_system
View on GitHub
☆12Jun 14, 2024Updated 2 years ago
mavceleb / mavceleb_baseline
View on GitHub
☆11Nov 5, 2025Updated 8 months ago
mpc001 / auto_avsr
View on GitHub
Auto-AVSR: Lip-Reading Sentences Project
☆429Jan 8, 2025Updated last year
backspacetg / distilXLSR
View on GitHub
Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
☆13Mar 30, 2025Updated last year
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
qiuqiao / DDSP-HiFiGAN
View on GitHub
基于PC-DDSP和nsf-HiFiGAN的声码器
☆19Jul 17, 2023Updated 3 years ago
zds-potato / multilingual-phonetic-sv
View on GitHub
☆10Dec 22, 2023Updated 2 years ago
fgnt / speaker_reassignment
View on GitHub
Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment
☆14Feb 5, 2025Updated last year
ydqmkkx / ShallowFlowMatching-TTS
View on GitHub
Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis
☆55Sep 20, 2025Updated 10 months ago
AMAAI-Lab / DART
View on GitHub
Demo for DART, Audio Imagination workshop submission in NeurIPS 2024
☆16Apr 22, 2026Updated 3 months ago
IDRnD / VoxTube
View on GitHub
The VoxTube dataset official repository
☆71Feb 14, 2024Updated 2 years ago
yochaiye / LipVoicer
View on GitHub
Official Code implementation for the ICLR paper "LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading"
☆88Sep 19, 2024Updated last year
wenet-e2e / wesr
View on GitHub
We Speech Transcript based on LLM, in 300 lines of code.
☆182Jun 20, 2025Updated last year
samsad35 / code-ancogen
View on GitHub
[ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder
☆14Mar 11, 2025Updated last year
End-to-end encrypted email - Proton Mail • Ad
Special offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
khanld / Dynamic-Mixing
View on GitHub
Dynamic Mixing For Speech Processing (mix-on-the-fly)
☆22Jul 19, 2022Updated 4 years ago
choijeongsoo / lip2speech-unit
View on GitHub
[Interspeech 2023] Intelligible Lip-to-Speech Synthesis with Speech Units
☆47Oct 26, 2024Updated last year
cnaigithub / SpeechDewarping
View on GitHub
Official implementation of "Unsupervised Pre-training for Data-Efficient Text-to-Speech on Low Resource Languages", ICASSP 2023
☆27Apr 27, 2023Updated 3 years ago
rithiksachdev / PostASR-Correction-SLT2024
View on GitHub
☆18Jul 22, 2024Updated 2 years ago
naba89 / custom_hf_trainer
View on GitHub
A custom Huggingface trainer which supports logging auxiliary losses returned by your model
☆15Jul 27, 2025Updated last year
kaistmm / voxceleb-disentangler
View on GitHub
[INTERSPEECH 2024] Official pytorch code for the paper "Disentangled Representation Learning for Environment-agnostic Speaker Recognition…
☆18Jul 23, 2024Updated 2 years ago
ahaliassos / usr
View on GitHub
Official implementation of USR (NeurIPS 2024)
☆40Dec 21, 2024Updated last year
malradhi / PACodec
View on GitHub
[ICASSP 2026]Official code for "Prosody-Guided Harmonic Attention for Phase-Coherent Neural Vocoding in the Complex Spectrum"
☆27Jan 22, 2026Updated 6 months ago
y-ren16 / OV-InstructTTS
View on GitHub
☆22Jan 27, 2026Updated 6 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
shengcanxu / canoSpeech
View on GitHub
text to speech
☆10Mar 19, 2024Updated 2 years ago
AdvSV / AdvSV.github.io
View on GitHub
AdvSV stands as the first dataset developed specifically for evaluating Speaker Verification (SV) systems against adversarial attacks. I…
☆11Nov 21, 2023Updated 2 years ago
PINTO0309 / onnx-aec
View on GitHub
A playground for experimenting with acoustic echo cancellation using a microphone, speaker, and ONNX.
☆13Oct 22, 2024Updated last year
Sreyan88 / LipGER
View on GitHub
Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition
☆19Jul 16, 2024Updated 2 years ago
yoongi43 / VRVQ
View on GitHub
Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"
☆11Apr 10, 2025Updated last year
pengzhendong / streaming-tts-webui
View on GitHub
Streaming Text to Speech Web UI
☆22May 6, 2024Updated 2 years ago
drscotthawley / fad_pytorch
View on GitHub
Frechet Audio Distance evaluation in PyTorch
☆36Jun 9, 2023Updated 3 years ago