lissettecarlr/speaker-diarization


Extracts the voices of different speakers from a video and saves them separately, producing audio training data.

☆31

Alternatives and similar repositories for speaker-diarization

Users that are interested in speaker-diarization are comparing it to the libraries listed below.

leohuang2013 / pyannote-audio_overlapped-speech-detection_cpp
C++ version of pyannote audio overlapped speech detection pipeline
☆13 · Feb 14, 2024 · Updated 2 years ago
emVisible / emRag
A local knowledge base built on LangChain + Xinference + Chroma
☆12 · Jun 13, 2025 · Updated last year
UlugbekSalaev / UzTransliterator
UzTransliterator | State-of-the-art machine transliteration tool for Uzbek language
☆13 · Jan 6, 2026 · Updated 6 months ago
Allen-lz / audio2face_pytorch
☆12 · Aug 15, 2022 · Updated 3 years ago
Apauto-to-all / GPT-soVITS-Inference-batchTool
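A batch inference tool that runs inference on the same text multiple times, with support for randomized parameters, until the most satisfactory result is selected.
☆11 · Aug 19, 2024 · Updated last year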
hbwu-ntu / EmoCtrlTTS-Eval
☆19 · Aug 23, 2024 · Updated last year
megaease / easevoice-trainer-portal
EaseVoice Trainer is a simple and user-friendly voice cloning and speech model trainer.
☆15 · Apr 27, 2025 · Updated last year
stack-over-flo-w / Vanna_RAG_llm
Using the Vanna framework and a custom API: a complete example of calling both.
☆20 · Jul 17, 2024 · Updated 2 years ago
kimsunwiub / BLOOM-Net
Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement"
☆14 · Feb 13, 2022 · Updated 4 years ago
ardyadipta / gemini_chatbot_sql
Create a chatbot using Gemini and RAG that can read from SQL databases
☆17 · Dec 5, 2024 · Updated last year
RicherMans / PSL
Source code for ICASSP2022 "Pseudo Strong labels for large scale weakly supervised audio tagging"
☆31 · Apr 29, 2022 · Updated 4 years ago
Shy2593666979 / Agent_Multiple-Talk
An LLM-based multi-turn question-answering system that combines intent recognition and slot filling.
☆24 · Jul 30, 2025 · Updated 11 months ago
SpeechColab / GigaSpeech2
An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement
☆197 · Apr 28, 2026 · Updated 2 months ago
Xiaobin-Rong / lite-rtse
An unofficial implementation of Lite-RTSE, a cost-effective lite model for real-time speech enhancement
☆14 · Nov 19, 2023 · Updated 2 years ago
sh01k / SourceSensorPlacementSFC
Optimizing Source and Sensor Placement for Sound Field Control
☆16 · Mar 27, 2023 · Updated 3 years ago
GT-KIM / specmix
This is a project of Interspeech2021 paper "SpecMix : A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Fea…
☆11 · Sep 27, 2022 · Updated 3 years ago
jdonley / MSR
Multizone Soundfield Reproduction
☆15 · Mar 23, 2018 · Updated 8 years ago
nightmoonbridge / vast_dft
A code repository for the accepted paper entitled "Fast Generation of Sound Zones Using Variable Span Trade-Off Filters in the DFT-Domain…
☆18 · Feb 17, 2025 · Updated last year
amrrs / deepseek-r1-agent
Deepseek R1 Agent powered by LMStudio and Smolagents
☆32 · Jan 21, 2025 · Updated last year
Clovermax / AED-TSVAD
Attention-Based Encoder-Decoder Target-Speaker Voice Activity Detection for Robust Speaker Diarization
☆31 · Sep 22, 2025 · Updated 10 months ago
puneetsl / Romadeva
A simple tool to convert Roman script to Indic (Devanagari) script. As most Keyboards are English and to write in Indic script is di…
☆13 · Aug 31, 2016 · Updated 9 years ago
scutcsq / DWFormer
DWFormer: Dynamic Window Transformer for Speech Emotion Recognition (ICASSP 2023 Oral)
☆69 · Jul 8, 2024 · Updated 2 years ago
Vision-CAIR / dochaystacks
Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents, CVPR 2025
☆26 · Jan 25, 2025 · Updated last year
KhanhNguyen4999 / Speech-Enhancement-CLSKD
Cross-Layer Similarity Knowledge Distillation for Speech Enhancement
☆11 · Jun 22, 2023 · Updated 3 years ago
sp-uhh / uncertainty-SE
☆17 · Mar 30, 2023 · Updated 3 years ago
brucecxm / bruce-cxm-douyin
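The database design, technology selection, and front-end and back-end code for this project were all done by me. I love scrolling Douyin, and one day it occurred to me to try building my own; the software development skills I had learned had not yet found a use, so this project was born.
☆10 · Apr 7, 2026 · Updated 3 months ago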
danpovey / kaldi
Dan Povey's local copy of kaldi-asr/kaldi
☆19 · Nov 10, 2023 · Updated 2 years ago
mit-ll / linkq
☆30 · May 21, 2026 · Updated 2 months ago
staymylove / 3DMIT
Code of 3DMIT: 3D MULTI-MODAL INSTRUCTION TUNING FOR SCENE UNDERSTANDING
☆32 · Jul 26, 2024 · Updated last year
TeaPoly / PLCPA-ASYM-Loss
The power-law compressed phase-aware asymmetric (PLCPA-ASYM) loss
☆15 · Sep 4, 2023 · Updated 2 years ago
node-modules / connection
☆11 · Jun 19, 2024 · Updated 2 years ago
Okrio / deepvqe
☆14 · Oct 12, 2023 · Updated 2 years ago
ASLP-lab / Easy-Turn
Open-Source Turn-Taking Detection Model and Dataset for Full-Duplex Spoken Dialogue Systems
☆121 · Jan 25, 2026 · Updated 5 months ago
kuijiang94 / leetcode-master
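LeetCode problem-solving guide: a recommended order for 200 classic problems, 600,000 characters of detailed illustrated explanations, video walkthroughs of the difficult points, and more than 50 mind maps, so that learning algorithms is no longer confusing! 🔥🔥 Take a look and you will wish you had found it sooner! 🚀
☆15 · Jul 12, 2021 · Updated 5 years ago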
Aworselife / DPTBF
☆17 · Sep 12, 2023 · Updated 2 years ago
HuangZikang-TJU / Aug4TSE
☆15 · Sep 16, 2024 · Updated last year
L6-NLP / Generative-Annotation-NEC
Generative_Annotation_NEC: A novel NEC method that utilizes speech sound features to retrieve candidate entities and a generative method …
☆17 · Dec 2, 2025 · Updated 7 months ago
2DIPW / audio_dataset_screener
An auxiliary tool for manual screening of audio datasets.
☆132 · Jun 23, 2023 · Updated 3 years ago
mnansary / bnUnicodeNormalizer
Bangla Unicode Normalization
☆24 · May 26, 2024 · Updated 2 years ago