SMIL-SPCRAS / DAVIS

Official repo for "Audio-Visual Speech Recognition In-the-Wild: Multi-Angle Vehicle Cabin Corpus and Attention-based Method" in ICASSP 2024

☆9

Alternatives and similar repositories for DAVIS

Users who are interested in DAVIS are comparing it to the repositories listed below

etri / kmsav
☆11 Updated 9 months ago
audiodemo / voice-conversion
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
☆17 Updated last year
bagustris / ssl-ser
Repository for reproducing the results of the journal article "Self-supervised learning for Speech Emotion Recognition"
☆10 Updated 2 years ago
amphionspace / tts-evaluation
An evaluation set for large-scale trained TTS models (Coming in Sep 2024)
☆12 Updated 11 months ago
liu12366262626 / AlignVSR
Visual Speech Recognition
☆18 Updated 7 months ago
ZehuaKcrissLi / GTR-Voice
☆13 Updated 9 months ago
ashi-ta / speechGLUE
SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.
☆13 Updated 2 years ago
bshall / dusted
DUSTED: Spoken-Term Discovery using Discrete Speech Units
☆17 Updated 10 months ago
shinhyeokoh / rwen
☆14 Updated 2 years ago
declare-lab / HyperTTS
☆36 Updated last year
RanaCM / DSU-AVO
Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023
☆12 Updated last year
kjw11 / Speaker-Aware-CTC
Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.
☆20 Updated 2 months ago
WangHelin1997 / Automatic_Speech_Annotator
Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat…
☆33 Updated last year
shengcanxu / canoSpeech
text to speech
☆10 Updated last year
Tele-AI / TELEVAL
☆16 Updated 2 weeks ago
cpii-cai / PunCantonese
A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts
☆14 Updated 8 months ago
huutuongtu / Lightvoc
LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform
☆18 Updated last year
cyhuang-tw / robust-vc
☆11 Updated 3 years ago
skhu101 / Bayesian_TDNN
This repository contains the Kaldi LF-MMI implementation of the paper "Bayesian Learning of LF-MMI Trained Time Delay Neural Networks for…
☆9 Updated 3 years ago
WangHelin1997 / Aty-TTS
Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech
☆10 Updated 2 months ago
csalt-research / accented-codebooks-asr
☆19 Updated 11 months ago
pkufool / simple-wer
A simple command line tool to calculate WER for ASR.
☆14 Updated 9 months ago
utter-project / mHuBERT-147-scripts
Collection of scripts from mHuBERT-147.
☆29 Updated 8 months ago
TTS-Research / PEL-TTS
☆14 Updated last year
ex3ndr / supervoice-hybrid
My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one
☆26 Updated last year
nonverbalspeech38k / nonverspeech38k
The official repository for NonVerbalSpeech-38K.
☆15 Updated last week
shivammehta25 / BetterFastSpeech2
Just another FastSpeech 2 but cleaner code :)
☆26 Updated last year
ogunlao / glowtts_stdp
Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor
☆18 Updated 2 years ago
haoheliu / ontology-aware-audio-tagging
☆13 Updated 2 years ago
backspacetg / distilXLSR
Models and code for the INTERSPEECH 2023 paper "DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model"
☆12 Updated 4 months ago