sieve-community / fast-asd
an optimized, production-ready implementation of active speaker detection
☆78 Updated last year
Alternatives and similar repositories for fast-asd
Users who are interested in fast-asd are comparing it to the repositories listed below.
- Demo python script app to interact with llama.cpp server using whisper API, microphone and webcam devices. ☆46 Updated 2 years ago
- Efficient approach to speaker diarization using voice characteristics extraction ☆105 Updated 7 months ago
- Our idea is to combine the power of computer vision model and LLMs. We use YOLO, CLIP and DINOv2 to extract high-level features from imag… ☆118 Updated 2 years ago
- 🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning ☆161 Updated last year
- Use Grounding DINO, Segment Anything, and GPT-4V to label images with segmentation masks for use in training smaller, fine-tuned models. ☆66 Updated 2 years ago
- ☆206 Updated last year
- ☆157 Updated 2 years ago