Extracts the voices of the different speakers in a video and saves them separately to produce audio training data
☆31May 23, 2024Updated 2 years ago
Alternatives and similar repositories for speaker-diarization
Users that are interested in speaker-diarization are comparing it to the libraries listed below.
- C++ version of pyannote audio overlapped speech detection pipeline☆13Feb 14, 2024Updated 2 years ago
- UzTransliterator | State-of-the-art machine transliteration tool for Uzbek language☆13Jan 6, 2026Updated 5 months ago
- ☆12Aug 15, 2022Updated 3 years ago
- colorizing images☆10Sep 16, 2022Updated 3 years ago
- Using vanna framework and custom api. Complete end-to-end calls to the Vanna framework and a custom API☆20Jul 17, 2024Updated last year
- ☆19Aug 23, 2024Updated last year
- An LLM-based multi-turn question-answering system that combines intent recognition and slot filling☆22Jul 30, 2025Updated 10 months ago
- A lightweight tool that efficiently isolates target speaker data from your datasets.☆20Nov 23, 2024Updated last year
- Export WAV audio files from VALORANT☆11Aug 1, 2023Updated 2 years ago
- Source code for ICASSP2022 "Pseudo Strong labels for large scale weakly supervised audio tagging"☆31Apr 29, 2022Updated 4 years ago
- Anime4k v0.9 effect shader implementation for obs to improve the quality of the preview and stream☆13Mar 19, 2024Updated 2 years ago
- An unofficial implementation of Lite-RTSE, a cost-effective lite model for real-time speech enhancement☆14Nov 19, 2023Updated 2 years ago
- Image Converter Ultra☆16Jan 29, 2026Updated 4 months ago
- Lean neural real-time acoustic echo cancellation with soft delay estimation - GGML and PyTorch inference☆103May 29, 2026Updated last week
- An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement☆193Apr 28, 2026Updated last month
- The power-law compressed phase-aware asymmetric (PLCPA-ASYM) loss☆14Sep 4, 2023Updated 2 years ago
- This is a project of Interspeech2021 paper "SpecMix : A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Fea…☆11Sep 27, 2022Updated 3 years ago
- Optimizing Source and Sensor Placement for Sound Field Control☆16Mar 27, 2023Updated 3 years ago
- An unofficial non-causal Tensorflow implementation of "Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Spee…☆14Dec 27, 2022Updated 3 years ago
- ☆15Sep 16, 2024Updated last year
- A code repository for the accepted paper entitled "Fast Generation of Sound Zones Using Variable Span Trade-Off Filters in the DFT-Domain…☆18Feb 17, 2025Updated last year
- Batch decryptor script for .wsdcf files VR JAV videos from DMM.☆21Jan 11, 2024Updated 2 years ago
- Cross-Layer Similarity Knowledge Distillation for Speech Enhancement☆11Jun 22, 2023Updated 2 years ago
- ☆12May 22, 2023Updated 3 years ago
- ☆17Mar 30, 2023Updated 3 years ago
- Open-Source Turn-Taking Detection Model and Dataset for Full-Duplex Spoken Dialogue Systems☆112Jan 25, 2026Updated 4 months ago
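- Easy-to-use vocal separation via a command-line interface (CLI) or Python package, using a variety of excellent models (mainly trained by @Anjok07 as part of the UVR project)☆33Mar 1, 2026Updated 3 months ago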
- Multizone Soundfield Reproduction☆15Mar 23, 2018Updated 8 years ago
- For creating and modifying quest for MH:F☆11Jul 17, 2020Updated 5 years ago
- It is a simple tool to convert roman script to indic(Devanagari) script. As most Keyboards are English and to write in Indic script is di…☆13Aug 31, 2016Updated 9 years ago
- Dan Povey's local copy of kaldi-asr/kaldi☆19Nov 10, 2023Updated 2 years ago
- A collection of handy scripts for training voice models with RVC☆26Apr 8, 2023Updated 3 years ago
- Official code for the paper "Speaking Clearly: A Simplified Whisper-Based Codec for Low-Bitrate Speech Coding"☆37Jan 28, 2026Updated 4 months ago
- FEETECH BUS Servo Python library☆34Jul 3, 2025Updated 11 months ago
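- A data preprocessing project. The first step processes the collected audio, using Funasr timestamps to remove empty background audio. It also includes the labels prepared before feeding BERT☆15May 27, 2025Updated last year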
- An implementation of rnn transducer for sequence labeling problem☆22Feb 24, 2018Updated 8 years ago
- ☆17Sep 12, 2023Updated 2 years ago
- ☆20Apr 27, 2026Updated last month
- Based on the paper "Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition"☆29Nov 20, 2024Updated last year