Repo for active speaker detection in media videos.
☆31Nov 19, 2023Updated 2 years ago
Alternatives and similar repositories for movie-asd
Users that are interested in movie-asd are comparing it to the libraries listed below
- This repository is a repository for the paper, "Irgun: Improved residue based gradual up-scaling network for single image super resolutio…☆15Aug 26, 2020Updated 5 years ago
- ☆21Nov 17, 2025Updated 3 months ago
- The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection)☆167Mar 23, 2025Updated 11 months ago
- [WACV 2026] LASER: Lip Landmark Assisted Speaker Detection for Robustness official implementation☆22Feb 26, 2026Updated last week
- Audio-Visual Active Speaker Detection with PyTorch on AVA-ActiveSpeaker dataset☆72Jan 18, 2022Updated 4 years ago
- Official implementation of Transpotter, published in BMVC 2021☆16Aug 6, 2022Updated 3 years ago
- ☆19Apr 18, 2024Updated last year
- INTERSPEECH2023: Target Active Speaker Detection with Audio-visual Cues☆58May 29, 2023Updated 2 years ago
- Code implementation for our ICPR, 2020 paper titled "Improving Word Recognition using Multiple Hypotheses and Deep Embeddings"☆21May 21, 2021Updated 4 years ago
- [INTERSPEECH 2022] This dataset is designed for multi-modal speaker diarization and lip-speech synchronization in the wild.☆59Jan 24, 2024Updated 2 years ago
- The repository for Springer IJCV 2025 (LR-ASD: Lightweight and Robust Network for Active Speaker Detection)☆93Mar 23, 2025Updated 11 months ago
- Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection (ECCV 2022)☆68Oct 29, 2023Updated 2 years ago
- Code for Speaker Change Detection in Broadcast TV using Bidirectional Long Short-Term Memory Networks☆65Jul 14, 2020Updated 5 years ago
- A SapientML plugin of SapientMLGenerator☆11Dec 23, 2025Updated 2 months ago
- Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"☆115Nov 16, 2020Updated 5 years ago
- [TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation☆31Sep 6, 2024Updated last year
- pix2pix and Cycle GAN architectures for image style transfer☆13May 27, 2021Updated 4 years ago
- ☆16Jan 16, 2023Updated 3 years ago
- ☆17Feb 17, 2026Updated 3 weeks ago
- Wikimedia Enterprise - client SDK in Python☆20Nov 11, 2025Updated 3 months ago
- https://demo-web.reflex.run ☆12Apr 25, 2024Updated last year
- Open Translator: Speech To Speech and Speech to text Translator with voice cloning and other cool features☆14Updated this week
- 2D physics engine☆11Jan 12, 2023Updated 3 years ago
- ☆13Dec 8, 2022Updated 3 years ago
- ☆12Apr 21, 2025Updated 10 months ago
- A platform aimed at creating websites that perform self-optimization☆12May 4, 2024Updated last year
- Create realistic looking handwritten text PDFs from text files.☆15Jun 19, 2021Updated 4 years ago
- javascript animation capture examples 🎬☆13Mar 14, 2023Updated 2 years ago
- Automate your blogging with AI-powered tools for creating, optimizing, and deploying content. Generate SEO-optimized articles effortlessl…☆12Aug 16, 2024Updated last year
- Data: Ecosystem news, GitHub updates, discussion summaries, and other useful bits for knowledge / RAG systems☆66Updated this week
- wav2lip-api☆11Mar 16, 2023Updated 2 years ago
- ☆11Jul 19, 2023Updated 2 years ago
- ☆15Sep 18, 2025Updated 5 months ago
- Jai bindings for the sokol headers (https://github.com/floooh/sokol)☆13Mar 3, 2026Updated last week
- A team of AI agents that answer document related questions (RAG alternative)☆13Apr 16, 2025Updated 10 months ago
- Generating Summaries with Controllable Readability Levels (EMNLP 2023)☆15Aug 6, 2025Updated 7 months ago
- Rust implementation of the Fift esoteric language☆12Aug 19, 2025Updated 6 months ago
- ☆14Mar 9, 2023Updated 3 years ago
- Automatic audio transcription to .srt using Google's Speech to Text API☆12Oct 26, 2020Updated 5 years ago