[WACV 2026] LASER: Lip Landmark Assisted Speaker Detection for Robustness official implementation
☆28 · Feb 26, 2026 · Updated 3 months ago
Alternatives and similar repositories for LASER_ASD
Users that are interested in LASER_ASD are comparing it to the libraries listed below.
- RealisMotion: Decomposed Human Motion Control and Video Generation in the World Space (ICML 2026) ☆40 · May 12, 2026 · Updated 2 weeks ago
- The repository for Springer IJCV 2025 (LR-ASD: Lightweight and Robust Network for Active Speaker Detection) ☆115 · Mar 23, 2025 · Updated last year
- ☆68 · Sep 13, 2022 · Updated 3 years ago
- Repo for active speaker detection for media videos. ☆31 · Nov 19, 2023 · Updated 2 years ago
- Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition ☆19 · Jul 16, 2024 · Updated last year
- This repository facilitates the creation of Python wheel files (.whl) from the tiny-cuda-nn project to streamline the installation proces… ☆12 · Jul 2, 2025 · Updated 10 months ago
- ViSpeR: Multilingual Audio-Visual Speech Recognition ☆58 · Apr 17, 2025 · Updated last year
- ☆64 · Jul 1, 2025 · Updated 10 months ago
- Official implementation of USR (NeurIPS 2024) ☆40 · Dec 21, 2024 · Updated last year
- AMI Meeting Parallel Corpus ☆12 · Dec 11, 2020 · Updated 5 years ago
- UMETTS: A Unified Framework for Emotional Text-to-Speech Synthesis with Multimodal Prompts ☆41 · Jun 12, 2025 · Updated 11 months ago
- The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection) ☆173 · Mar 23, 2025 · Updated last year
- Face detection algorithms in PyTorch. ☆81 · Jan 27, 2022 · Updated 4 years ago
- CLI for archiving a page and all its links to the Wayback Machine ☆14 · Mar 10, 2022 · Updated 4 years ago
- Recognize speech from an audio file and convert it into animation FBX ☆24 · Mar 7, 2022 · Updated 4 years ago
- ☆23 · Jul 30, 2024 · Updated last year
- The project page repo for Neural Dubber. ☆30 · Sep 20, 2023 · Updated 2 years ago
- Implementation of "Look, Listen and Recognise: character-aware audio-visual subtitling" ☆20 · Nov 3, 2025 · Updated 6 months ago
- WeChat official account: 机器感知 (Machine Perception) | Tracking the Latest Layer Diffusion Trending ☆20 · Dec 1, 2024 · Updated last year
- An ambiguous subtitles dataset for visual scene-aware machine translation ☆14 · Oct 17, 2022 · Updated 3 years ago
- Make tool-calling schemas for existing tools ☆14 · Mar 8, 2025 · Updated last year
- ☆37 · Jan 8, 2026 · Updated 4 months ago
- Real-time transcription application ☆12 · Jun 9, 2023 · Updated 2 years ago
- [ICLR 2026] SparseD: Sparse Attention for Diffusion Language Models ☆65 · Feb 22, 2026 · Updated 3 months ago
- Repo which installs other Edge AI repos to build on PC and install on a target file system ☆15 · Aug 7, 2025 · Updated 9 months ago
- Make DB of Dojinvoice (DLsite) ☆13 · Apr 25, 2026 · Updated last month
- ☆12 · Jan 27, 2017 · Updated 9 years ago
- ☆12 · Nov 25, 2021 · Updated 4 years ago
- This is a PyTorch implementation of a Transformer Decoder based model that plays chess. ☆17 · Mar 15, 2024 · Updated 2 years ago
- This repo is a reproduction of Channel_pruning ☆11 · May 17, 2018 · Updated 8 years ago
- Character Grounding and Re-Identification in Story of Videos and Text Descriptions ☆10 · Jan 17, 2021 · Updated 5 years ago
- A guide to structured generation using constrained decoding ☆18 · Jun 9, 2024 · Updated last year
- Retrieval Augmented Generation, but no servers involved. Backed by S3 ☆12 · Nov 3, 2023 · Updated 2 years ago
- Add n-gram and large language model (LLM) support to Whisper models. ☆43 · May 6, 2025 · Updated last year
- AU-Expression Knowledge Constrained Representation Learning for Facial Expression Recognition (ICRA 2021) ☆11 · Dec 29, 2023 · Updated 2 years ago
- Automatic music transcription application written in Java ☆12 · Jan 13, 2013 · Updated 13 years ago
- A ComfyUI custom node by BillBum for using api gen (VLM LLM T2I API Tools) ☆10 · Apr 22, 2026 · Updated last month
- Dynamic vision-guided speaker embedding for audio-visual speaker diarization ☆12 · Jul 5, 2022 · Updated 3 years ago
- ☆10 · Apr 22, 2021 · Updated 5 years ago