an optimized, production-ready implementation of active speaker detection
☆82May 29, 2024Updated last year
Alternatives and similar repositories for fast-asd
Users that are interested in fast-asd are comparing it to the libraries listed below.
- ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'☆467Oct 23, 2023Updated 2 years ago
- The purpose of this repository is to discuss audio transformers☆14Apr 16, 2026Updated 2 weeks ago
- The code for some apps built with Sieve.☆85Nov 22, 2024Updated last year
- code repo for LoCoNet: Long-Short Context Network for Active Speaker Detection☆54May 1, 2023Updated 3 years ago
- This repository demonstrates various examples using YOLO☆13Feb 9, 2024Updated 2 years ago
- MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models☆15May 14, 2024Updated last year
- MCP Filesystem tools written in Rust☆22Mar 6, 2026Updated last month
- A quality zero-shot lipsync pipeline built with MuseTalk, LivePortrait, and CodeFormer.☆49Sep 25, 2024Updated last year
- ECCV 2024 STMA & CVPR 2024 1st MOSE & 1st VOT Challenge & 1st LSVOS v6☆12Oct 16, 2024Updated last year
- Adding a multi-text multi-speaker script (diffe) that is based on a script from asiff00 on issue 61 for Sesame: A Conversational Speech G…☆26Mar 28, 2025Updated last year
- The implementation of LLMTreeRec☆14Dec 9, 2024Updated last year
- A Slate plugin to handle onChange event on silence without event stack. Useful for implementing auto save Editor.☆15Jul 17, 2023Updated 2 years ago
- Accurately locating each head's position in the crowd scenes is a crucial task in the field of crowd analysis. However, traditional densi…☆21Mar 16, 2024Updated 2 years ago
- Face Verification API☆11Sep 27, 2021Updated 4 years ago
- ☆15Oct 26, 2023Updated 2 years ago
- Automatically turn your handwritten journal entries into a website using GPT3 OCR python and html☆13Dec 15, 2021Updated 4 years ago
- The fastest Whisper optimization for automatic speech recognition as a command-line interface ⚡️☆10Dec 3, 2023Updated 2 years ago
- Automatic knowledge graph generation for Obsidian.md☆29Sep 13, 2023Updated 2 years ago
- Streamlit-Based License Plate Recognition (LPR) App☆12Mar 26, 2025Updated last year
- 1st place solution to the DCASE 2020 - Task 5 - Urban Sound Tagging with Spatiotemporal Context☆16Dec 8, 2022Updated 3 years ago
- ☆21Aug 21, 2024Updated last year
- Summarizing with LLMs: Using an LLM to understand GitHub issues without reading each post in detail.☆15Jul 22, 2024Updated last year
- Python app to sync Video Files to the beat of a song☆12Aug 5, 2019Updated 6 years ago
- ☆14Oct 2, 2017Updated 8 years ago
- A web application for a simplified version of a digital book collection, using FastAPI for the backend and React for the frontend.☆12Jan 9, 2025Updated last year
- The implementation of g2pL with a new open dataset.☆16May 14, 2023Updated 2 years ago
- Optimized Syncnet and Chinese enhanced version, EN and CN checkpoints released☆11Nov 8, 2021Updated 4 years ago
- code to help with tsne plotting☆16May 19, 2020Updated 5 years ago
- Deep Audio Segmenter, unsupervised☆10Feb 20, 2026Updated 2 months ago
- ☆10Aug 19, 2024Updated last year
- Simple playground chat app that interacts with OpenAI's functions with memory and custom tools.☆17Jul 11, 2023Updated 2 years ago
- ShellSpeak translates natural language to shell commands, simplifying system interactions for non-tech-savvy users. With color-coded UI, …☆12Nov 26, 2023Updated 2 years ago
- repo for active speaker detection for media videos.☆31Nov 19, 2023Updated 2 years ago
- A plugin for connecting Open MCT to ROS 2 (and maybe ROS 1).☆10Jun 5, 2025Updated 10 months ago
- [CVPR2025] KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation☆71Apr 8, 2025Updated last year
- Official repo of the paper “AL-GTD: Deep Active Learning for Gaze Target Detection” (ACMMM2024)☆12Nov 29, 2024Updated last year
- ☆13May 25, 2024Updated last year
- Deep learning and standard machine learning methods are developed and compared in classifying audio samples from microphones deployed abo…☆11Jan 17, 2020Updated 6 years ago
- UpToDateAI, an open source tool to help you help AI assist you with coding and debugging in lesser-known or newly released programming fr…☆12Sep 10, 2024Updated last year