IIP-Sogang/olkavs-avspeech

The Introduction of the OLKAVS Dataset

☆39

Alternatives and similar repositories for olkavs-avspeech

Users who are interested in olkavs-avspeech are comparing it to the repositories listed below.

shleee47 / Sound-Source-Localization
Sound Source Localization for AI Grand Challenge 2021
☆21 · Feb 7, 2022 · Updated 4 years ago
kooBH / drone-robust-gender-classification
Speech-based speaker recognition technology for life-rescue drones
☆31 · Jan 31, 2023 · Updated 3 years ago
shleee47 / mpWAV-Sound-Source-Localization
Sound Source Localization for AI Grand Challenge 2021
☆22 · Feb 8, 2022 · Updated 4 years ago
kooBH / PCM-A10-SSL
Sound Source Localization for PCM-A10 Microphone
☆33 · Jan 31, 2023 · Updated 3 years ago
CherokeeLanguage / Cherokee-TTS
Using Tacotron2 to do Cherokee Text to Speech
☆10 · May 10, 2022 · Updated 4 years ago
dmlguq456 / NeXt_TDNN_ASV
Official repository of NeXt-TDNN for speaker verification
☆85 · Oct 10, 2024 · Updated last year
prajwalkr / vtp
Official Implementation of Visual Transformer Pooling for Lip reading
☆41 · Aug 8, 2022 · Updated 3 years ago
Miraikomachi / AIVoiceSamples
☆15 · Jun 4, 2021 · Updated 5 years ago
prajwalkr / transpotter
Official implementation of Transpotter, published in BMVC 2021
☆16 · Aug 6, 2022 · Updated 3 years ago
koreanAI / 2023-Korean-AI-Competition
2023 Korean AI Competition
☆10 · Oct 30, 2023 · Updated 2 years ago
ahstarwab / Violence_Detection
Online and real-time violence recognition
☆16 · Jul 5, 2022 · Updated 4 years ago
keonlee9420 / Deep-Learning-TTS-Template
This is a template for the Non-autoregressive Deep Learning-Based TTS model (in PyTorch).
☆14 · Jun 15, 2021 · Updated 5 years ago
etri / kmsav
☆14 · Oct 25, 2024 · Updated last year
jhCOR / EgoOrientBench
The Official Code Repo for EgoOrientBench [CVPR25]
☆17 · Nov 24, 2025 · Updated 8 months ago
revospeech / audio-generation-papers
Recent audio generation papers (including speech, music, and general audio)
☆13 · Mar 14, 2023 · Updated 3 years ago
mpc001 / Visual_Speech_Recognition_for_Multiple_Languages
Visual Speech Recognition for Multiple Languages
☆479 · Aug 17, 2023 · Updated 2 years ago
ms-dot-k / Visual-Context-Attentional-GAN
PyTorch implementation of "Lip to Speech Synthesis with Visual Context Attentional GAN" (NeurIPS 2021)
☆25 · Mar 9, 2024 · Updated 2 years ago
justinjohn0306 / Audio-Splitter
Audio Splitter provides a user-friendly solution for splitting audio files based on silence detection.
☆18 · May 28, 2023 · Updated 3 years ago
ISmallFish / Libri-adhoc40
A dataset collected from synchronized ad-hoc microphone arrays
☆19 · Apr 24, 2023 · Updated 3 years ago
KrishnaDN / Keyword-Transformer
Implementation of the paper "Keyword Transformer: A Self-Attention Model for Keyword Spotting"
☆23 · May 19, 2021 · Updated 5 years ago
ahaliassos / raven
Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024)
☆82 · Feb 27, 2025 · Updated last year
Sleepwalking / deadfish
A very simple audio editing tool.
☆14 · Jun 5, 2022 · Updated 4 years ago
DinoMan / face-processor
Aligns faces to the canonical face in both videos and images
☆17 · Apr 11, 2022 · Updated 4 years ago
UtaUtaUtau / nnsvslabeling
Python scripts I made to make NNSVS labeling easier.
☆27 · Jun 20, 2023 · Updated 3 years ago
JuanFMontesinos / VoViT
VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
☆35 · Mar 18, 2023 · Updated 3 years ago
TomohikoNakamura / asteroid_jaCappella
☆14 · Jul 28, 2023 · Updated 3 years ago
ductuantruong / speaker_age_estimation_ssl_study
[APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models
☆14 · Oct 19, 2022 · Updated 3 years ago
mpc001 / auto_avsr
Auto-AVSR: Lip-Reading Sentences Project
☆429 · Jan 8, 2025 · Updated last year
kaistmm / VoxMM
☆23 · May 11, 2026 · Updated 2 months ago
audeering / w2v2-age-gender-how-to
How to use our public wav2vec2 age and gender model
☆55 · Sep 4, 2023 · Updated 2 years ago
FrePainter / code
☆28 · Mar 28, 2024 · Updated 2 years ago
mavceleb / mavceleb_baseline
☆11 · Nov 5, 2025 · Updated 8 months ago
Neclow / SERAB
SERAB: a multi-lingual benchmark for speech emotion recognition
☆28 · Dec 16, 2022 · Updated 3 years ago
Dreamtonics / svstudio-translations
GUI translations for Synthesizer V Studio.
☆34 · Apr 4, 2024 · Updated 2 years ago
mpc001 / Lipreading_using_Temporal_Convolutional_Networks
ICASSP'22 Training Strategies for Improved Lip-Reading; ICASSP'21 Towards Practical Lipreading with Distilled and Efficient Models; ICASS…
☆438 · May 18, 2023 · Updated 3 years ago
jonghwanhyeon / hangul-jamo
A library to compose and decompose Hangul syllables using Hangul jamo characters
☆29 · Apr 22, 2022 · Updated 4 years ago
sungnyun / avsr-temporal-dynamics
(SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
☆13 · Oct 22, 2024 · Updated last year
nttcslab-sp / agevoxceleb
☆28 · Dec 22, 2021 · Updated 4 years ago
Sanyuan-Chen / CSS_with_EETransformer
Code for the ICASSP 2021 paper: Don't shoot butterfly with rifles: Multi-channel Continuous Speech Separation with Early Exit Transformer
☆12 · Sep 2, 2021 · Updated 4 years ago