an optimized, production-ready implementation of active speaker detection
☆82May 29, 2024Updated last year
Alternatives and similar repositories for fast-asd
Users who are interested in fast-asd are comparing it to the libraries listed below.
- ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'☆463Oct 23, 2023Updated 2 years ago
- The purpose of this repository is to discuss audio transformers☆14Mar 12, 2026Updated 3 weeks ago
- The code for some apps built with Sieve.☆85Nov 22, 2024Updated last year
- code repo for LoCoNet: Long-Short Context Network for Active Speaker Detection☆52May 1, 2023Updated 2 years ago
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model"☆15Jun 28, 2024Updated last year
- Audio-Visual Active Speaker Detection with PyTorch on AVA-ActiveSpeaker dataset☆72Jan 18, 2022Updated 4 years ago
- This repository demonstrates various examples using YOLO☆13Feb 9, 2024Updated 2 years ago
- MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models☆15May 14, 2024Updated last year
- A quality zero-shot lipsync pipeline built with MuseTalk, LivePortrait, and CodeFormer.☆49Sep 25, 2024Updated last year
- Adding a multi-text multi-speaker script (diffe) that is based on a script from asiff00 on issue 61 for Sesame: A Conversational Speech G…☆26Mar 28, 2025Updated last year
- The implementation of LLMTreeRec☆14Dec 9, 2024Updated last year
- Accurately locating each head's position in the crowd scenes is a crucial task in the field of crowd analysis. However, traditional densi…☆21Mar 16, 2024Updated 2 years ago
- ☆14Oct 26, 2023Updated 2 years ago
- Automatically turn your handwritten journal entries into a website using GPT3 OCR python and html☆13Dec 15, 2021Updated 4 years ago
- ☆10Feb 23, 2025Updated last year
- Linux daemon to bind touchpad gestures to shell commands.☆11Jun 25, 2024Updated last year
- Streamlit-Based License Plate Recognition (LPR) App☆12Mar 26, 2025Updated last year
- ☆21Aug 21, 2024Updated last year
- Summarizing with LLMs: Using an LLM to understand GitHub issues without reading each post in detail.☆15Jul 22, 2024Updated last year
- Simple, efficient and cross-platform TFIDF-based text summarizer in Rust☆13Apr 12, 2024Updated last year
- ☆15Mar 18, 2026Updated 3 weeks ago
- Python app to sync Video Files to the beat of a song☆12Aug 5, 2019Updated 6 years ago
- ☆15May 13, 2024Updated last year
- Content Aware Fill for Linux's Python☆10Jul 6, 2022Updated 3 years ago
- The implementation of g2pL with a new open dataset.☆16May 14, 2023Updated 2 years ago
- Optimized Syncnet and Chinese enhanced version, EN and CN checkpoints released☆11Nov 8, 2021Updated 4 years ago
- Image perspective transformation and text recognition☆10Jun 26, 2020Updated 5 years ago
- Add Rain Streak Mask On Unpaired Image Using GAN☆10Sep 12, 2020Updated 5 years ago
- ☆10Aug 19, 2024Updated last year
- Streamlit component like Microsoft Excel☆25Sep 7, 2022Updated 3 years ago
- ☆45Jan 17, 2023Updated 3 years ago
- repo for active speaker detection for media videos.☆31Nov 19, 2023Updated 2 years ago
- [CVPR2025] KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation☆69Apr 8, 2025Updated last year
- Official repo of the paper “AL-GTD: Deep Active Learning for Gaze Target Detection” (ACMMM2024)☆12Nov 29, 2024Updated last year
- ☆12May 25, 2024Updated last year
- Developing a news app with different recommender systems☆13May 22, 2023Updated 2 years ago
- Generate PDFs using libharu from Rust☆20Nov 16, 2024Updated last year
- Official repo of the paper "Object-aware Gaze Target Detection" (ICCV 2023)☆46Dec 5, 2024Updated last year
- An Agentic RAG starter that uses Swarm, Nemo Guardrails and SingleStore as a database☆29Dec 18, 2024Updated last year