lissettecarlr / speaker-diarization
Extracts the voices of different speakers from a video and saves them separately, producing audio training data
☆27May 23, 2024Updated last year
Alternatives and similar repositories for speaker-diarization
Users that are interested in speaker-diarization are comparing it to the libraries listed below
- C++ version of pyannote audio overlapped speech detection pipeline☆13Feb 14, 2024Updated 2 years ago
- Open-Source Turn-Taking Detection Model and Dataset for Full-Duplex Spoken Dialogue Systems☆75Jan 25, 2026Updated 3 weeks ago
- Virtual news production using Tacotron2 and Wav2Lip☆11Nov 14, 2023Updated 2 years ago
- Eliza Agent Weaver enables you to develop a set of Character files based on your own lore, and connects the narratives of multiple agents…☆10Dec 12, 2024Updated last year
- BanterBot: An OpenAI ChatGPT-powered chatbot with Azure Neural Voices. Supports multilingual speech-to-text and text-to-speech interactio…☆11Jan 23, 2026Updated 3 weeks ago
- C# console-app to split a comic-page into its separate panels☆13Jun 22, 2020Updated 5 years ago
- ☆15Sep 16, 2024Updated last year
- VexFS is a Linux kernel-native file system with built-in vector search and semantic memory. Designed for AI agents, RAG, and LLM workload…☆24Oct 19, 2025Updated 3 months ago
- An open-source platform for building and deploying real-time, low-latency AI voice agents for call automation for marketing.☆18Oct 16, 2025Updated 4 months ago
- Automatic music transcription application written in Java☆12Jan 13, 2013Updated 13 years ago
- ☆16Apr 10, 2025Updated 10 months ago
- Agent building tools via block diagram UI☆12Dec 31, 2025Updated last month
- ☆10Apr 22, 2021Updated 4 years ago
- A Cyberpunk 2077 First-Person Multi Rig for Blender (4.0+)☆11Jan 10, 2026Updated last month
- An unofficial implementation of Lite-RTSE, a cost-effective lite model for real-time speech enhancement☆14Nov 19, 2023Updated 2 years ago
- Implementing an interactive AI avatar using Python, Blender and GPT☆11Dec 5, 2023Updated 2 years ago
- A high-performance, distributed memory management system for LLM agents built with LangGraph, LangChain, Ray, and vLLM. Features multi-la…☆11Apr 23, 2025Updated 9 months ago
- ☆11May 2, 2022Updated 3 years ago
- ☆10Sep 2, 2024Updated last year
- Chatbot for NHS Medicines A-Z. Agentic Retrieval Augmented Generation utilising the OpenAI API, LangChain, and LangGraph to query a vecto…☆10Jun 24, 2024Updated last year
- PyTorch implementation of TinyWASE described in our paper "Compressing Speaker Extraction Model with Ultra-low Precision Quantization and…☆11Jun 28, 2021Updated 4 years ago
- A proposed GPT chatbot for teachers that uses retrieval-augmentation to answer questions about their students.☆10Dec 7, 2024Updated last year
- Conversational Speaker Diarization using OpenAI language models (GPT-4) and OpenAI Whisper.☆14Aug 13, 2023Updated 2 years ago
- A project about Virtual Try-On. Lines of code ~5,200.☆10Jan 27, 2021Updated 5 years ago
- end-to-end automated video generation pipeline designed to create engaging, TikTok-style viral short videos using AI.☆20Jun 7, 2025Updated 8 months ago
- Cross-Layer Similarity Knowledge Distillation for Speech Enhancement☆11Jun 22, 2023Updated 2 years ago
- Talk to your database as if you were chatting with a friend. Turn natural language into powerful SQL queries effortlessly, and get your a…☆10Nov 12, 2024Updated last year
- calvis: Chest, wAist and peLVIS circumference from 3D human Body meshes for Deep Learning.☆11May 15, 2025Updated 9 months ago
- Code for TCSVT paper "Exploring Spatio-Temporal Graph Convolution for Video-based Human-Object Interaction Recognition"☆12Mar 30, 2023Updated 2 years ago
- EaseVoice Trainer is a simple and user-friendly voice cloning and speech model trainer.☆14Apr 27, 2025Updated 9 months ago
- Multi-tenant RAG API powered by LightRAG/RAG-Anything. Auto-selects best parser (DeepSeek-OCR/MinerU/Docling) via complexity scoring☆24Dec 15, 2025Updated 2 months ago
- real-time web visualizer for 3D gaussian splatting☆10Jan 31, 2025Updated last year
- Official repository of Tapir Lab.'s Lip-Sync Method☆10Oct 3, 2023Updated 2 years ago
- This is a project of Interspeech2021 paper "SpecMix : A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Fea…☆11Sep 27, 2022Updated 3 years ago
- An unofficial non-causal Tensorflow implementation of "Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Spee…☆14Dec 27, 2022Updated 3 years ago
- The power-law compressed phase-aware asymmetric (PLCPA-ASYM) loss☆14Sep 4, 2023Updated 2 years ago
- ☆12Aug 17, 2024Updated last year
- ☆15Oct 10, 2023Updated 2 years ago
- ☆11Jun 19, 2024Updated last year