RanaCM/DSU-AVO

Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023

☆12

Alternatives and similar repositories for DSU-AVO

Users who are interested in DSU-AVO are comparing it to the libraries listed below.

W-Wu / ERC-SLT22
Code for "Distribution-based Emotion Recognition in Conversation"
☆18 · Feb 6, 2023 · Updated 3 years ago
xinshengwang / robpitch
A pitch detection model trained to be robust against noise and reverberation environments.
☆27 · Jan 21, 2025 · Updated last year
kaist-ami / voicecraft-dub
[ICCV'25] Official PyTorch Implementation of "VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models"
☆17 · Dec 8, 2025 · Updated 7 months ago
chenqi008 / V2C
Pytorch implementation for “V2C: Visual Voice Cloning”
☆34 · Jan 28, 2023 · Updated 3 years ago
NeuroWave-ai / CUCVAE-TTS
☆25 · Mar 12, 2022 · Updated 4 years ago
naver-ai / facetts
☆61 · May 17, 2023 · Updated 3 years ago
ms-dot-k / Lip-to-Speech-Synthesis-in-the-Wild
PyTorch implementation of "Lip to Speech Synthesis in the Wild with Multi-task Learning" (ICASSP2023)
☆71 · Mar 9, 2024 · Updated 2 years ago
thuhcsi / icassp2021-emotion-tts
Please visit: https://thuhcsi.github.io/icassp2021-emotion-tts/
☆34 · Mar 17, 2023 · Updated 3 years ago
zexupan / reentry
☆18 · Nov 22, 2024 · Updated last year
WangHelin1997 / Automatic_Speech_Annotator
Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat…
☆33 · Jun 14, 2024 · Updated 2 years ago
adelacvg / DPTTS
An AR+AR TTS attempt.
☆18 · Jan 13, 2025 · Updated last year
GalaxyCong / HPMDubbing
[CVPR 2023] Official code for paper: Learning to Dub Movies via Hierarchical Prosody Models.
☆112 · Jun 21, 2024 · Updated 2 years ago
zexupan / MuSE
☆42 · Nov 22, 2024 · Updated last year
kaist-ami / SoundBrush
☆14 · Dec 8, 2025 · Updated 7 months ago
Moon0316 / T2A
Project page for "Improving Few-shot Learning for Talking Face System with TTS Data Augmentation" for ICASSP2023
☆86 · Oct 10, 2023 · Updated 2 years ago
DavidMChan / Anim400K
Anim-400K: A dataset designed from the ground up for automated dubbing of video
☆118 · Jun 21, 2024 · Updated 2 years ago
scutcsq / Neural-Transducers-for-Two-Stage-Text-to-Speech-via-Semantic-Token-Prediction
Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…
☆60 · Apr 4, 2024 · Updated 2 years ago
Takaaki-Saeki / ssl_speech_restoration_v2
☆17 · Dec 18, 2023 · Updated 2 years ago
JiuFengSC / ElasticAST
Official code of ElasticAST (Interspeech 2024 paper)
☆34 · Jul 30, 2024 · Updated last year
b04901014 / UUVC
Official implementation for the paper: A Unified One-Shot Prosody and Speaker Conversion System with Self-Supervised Discrete Speech Unit…
☆83 · Jan 7, 2023 · Updated 3 years ago
kaist-ami / AVHBench
[ICLR'25] Official repository for "AVHBench: A Cross-Modal Hallucination Evaluation for Audio-Visual Large Language Models"
☆25 · Mar 8, 2026 · Updated 4 months ago
tail95 / Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
☆10 · Aug 1, 2019 · Updated 6 years ago
huckiyang / Interspeech23-Tutorial-Para-Efficient-Cross-Modal-Tutorial
Interspeech Tutorial - Resource Efficient and Cross-Modal Learning Toward Foundation Modeling
☆15 · Oct 9, 2023 · Updated 2 years ago
walker-hyf / ECSS
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI'2024)
☆59 · Jun 20, 2024 · Updated 2 years ago
p1an-lin-jung / wv_tts
☆19 · Mar 22, 2024 · Updated 2 years ago
rsnikhil / DEVEL_Learn_Bluespec_and_RISCV_Design
Development area for another repo: Learn_Bluespec_and_RISCV_Design
☆13 · Jun 28, 2026 · Updated 3 weeks ago
cnaigithub / SpeechDewarping
Official implementation of "Unsupervised Pre-training for Data-Efficient Text-to-Speech on Low Resource Languages", ICASSP 2023
☆27 · Apr 27, 2023 · Updated 3 years ago
google-research-datasets / LLAMA1-Test-Set
We introduce the LLAMA1 Test Set, a comprehensive open-domain world knowledge QA dataset for evaluating question-answering systems. We pr…
☆23 · Mar 14, 2024 · Updated 2 years ago
ogunlao / glowtts_stdp
Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor
☆19 · Jun 5, 2023 · Updated 3 years ago
kjw11 / Speaker-Aware-CTC
Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.
☆22 · May 26, 2025 · Updated last year
Takaaki-Saeki / zm-text-tts
[IJCAI'23] Learning to Speak from Text for Low-Resource TTS
☆65 · May 30, 2023 · Updated 3 years ago
3loi / MSP_Face
☆13 · Nov 15, 2024 · Updated last year
voidful / vall-e-encodec
☆41 · May 15, 2023 · Updated 3 years ago
gladiaio / normalization
A lightweight library for normalizing speech transcripts before computing WER
☆28 · Jul 14, 2026 · Updated last week
MWM-io / nansypp
Unofficial implementation of NANSY++ in Pytorch Lightning
☆50 · Mar 11, 2024 · Updated 2 years ago
thunlp / duplex-model
☆48 · Aug 17, 2024 · Updated last year
BASHLab / OWL
☆15 · May 25, 2026 · Updated 2 months ago
choijeongsoo / utut
[TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation
☆31 · Sep 6, 2024 · Updated last year
hs-oh-prml / DurFlexEVC
☆82 · Jan 22, 2025 · Updated last year