lovemefan/fsmn-vad

lovemefan / fsmn-vad

An enterprise-grade Voice Activity Detector from modelscope and funasr.

☆139

Alternatives and similar repositories for fsmn-vad

Users interested in fsmn-vad are comparing it to the libraries listed below.

Mddct / simple-tts
(WIP) Long-form speech generation
☆30 · Apr 2, 2025 · Updated last year
Mddct / cosyvoice2-flow-optimized
faster inference
☆27 · Jan 20, 2025 · Updated last year
nttcslab-sp / mamba-diarization
Official repository for Mamba-based Segmentation Model for Speaker Diarization
☆47 · May 13, 2025 · Updated last year
huangruizhe / audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
☆10 · Sep 30, 2024 · Updated last year
Mddct / usm-tokenizer
Semantic tokenizer for speech and music
☆20 · Jul 6, 2025 · Updated last year
lovemefan / paraformer-python
Paraformer (Chinese ASR) online ONNX runtime for Python
☆54 · Mar 27, 2024 · Updated 2 years ago
Mddct / WeUSM
☆13 · Mar 30, 2023 · Updated 3 years ago
daihuangyu / speex_aec_kf
Speex AEC Kalman filter
☆15 · Mar 17, 2024 · Updated 2 years ago
JethroWangSir / SincQDR-VAD
☆26 · Aug 29, 2025 · Updated 11 months ago
lifeiteng / NotebookTTS
Text-To-Speech for NotebookLM
☆39 · Jul 20, 2025 · Updated last year
lovemefan / CT-Transformer-punctuation
An enterprise-grade Chinese-English code-switching punctuator from funasr.
☆34 · Apr 26, 2024 · Updated 2 years ago
Mddct / transformer-vocos
☆35 · Sep 6, 2025 · Updated 10 months ago
PINTO0309 / onnx-aec
A playground for experimenting with acoustic echo cancellation using a microphone, speaker, and ONNX.
☆13 · Oct 22, 2024 · Updated last year
lovemefan / campplus
An open-source toolkit for single- and multi-modal speaker verification from modelscope and funasr, with ONNX
☆15 · Dec 16, 2023 · Updated 2 years ago
backspacetg / distilXLSR
Models and code for the INTERSPEECH 2023 paper "DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model"
☆13 · Mar 30, 2025 · Updated last year
fclearner / Personal-vad-2.0
Implementation of "Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition"
☆16 · Jun 9, 2026 · Updated last month
pengzhendong / audiolab
A streaming audio reader, processor, and writer built on top of soundfile and PyAV (bindings for FFmpeg)
☆39 · Mar 31, 2026 · Updated 3 months ago
zhuzizyf / damo-fsmn-vad-infer-httpserver
DAMO FSMN VAD C++ inference service
☆17 · Apr 17, 2023 · Updated 3 years ago
pengzhendong / asr-decoder
CTC decoder with hotwords for ASR.
☆38 · Jun 15, 2026 · Updated last month
csukuangfj / kaldi-native-fbank
Kaldi-compatible online fbank extractor without external dependencies
☆152 · Oct 9, 2025 · Updated 9 months ago
pkufool / simple-wer
A simple command line tool to calculate WER for ASR.
☆14 · Updated this week
tzyll / ChineseHP
Dataset for "Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models" (Interspeech 2024).
☆16 · Jul 4, 2024 · Updated 2 years ago
xiaoxiaomiao323 / MSA
☆16 · Feb 19, 2026 · Updated 5 months ago
chenkui164 / FastASR
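A C++ project for ASR inference. It has few dependencies, is easy to install, runs inference quickly, and runs smoothly on ARM platforms such as the Raspberry Pi 4B. The supported models are optimized from Google's Transformer model, and the datasets are the open-source wenetspeech (10000+ hours) or Alibaba's private dataset (60000+ …
☆553 · Mar 19, 2023 · Updated 3 years ago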
pengzhendong / streaming-asr
One command to start a streaming ASR server.
☆12 · Oct 2, 2024 · Updated last year
zhu-han / SpeechLLM
LLM-based ASR recipe with Zipformer encoder and Qwen LLM
☆35 · Sep 25, 2025 · Updated 10 months ago
mkunes / w2v2_audioFrameClassification
wav2vec2 audio classification for prosodic boundary detection and other tasks
☆42 · Aug 11, 2023 · Updated 2 years ago
Slyne / ctc_decoder
A CTC decoder for both online and offline ASR models
☆66 · Nov 18, 2023 · Updated 2 years ago
wenet-e2e / wesr
We Speech Transcript based on LLM, in 300 lines of code.
☆182 · Jun 20, 2025 · Updated last year
Audio-WestlakeU / FS-EEND
The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based …
☆183 · May 7, 2026 · Updated 2 months ago
ScottishFold007 / TTSAudioNormalizer
TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud…
☆113 · Dec 20, 2024 · Updated last year
SpeechColab / GigaSpeech2
An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement
☆198 · Apr 28, 2026 · Updated 3 months ago
RapidAI / RapidASR
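📣 A commercial-grade open-source automatic speech recognition library: works out of the box, supports all platforms, and recognizes mixed Chinese and English. A cross-platform implementation of ASR inference. It's based on ONNXRuntime and FunASR. We provide …
☆608 · May 15, 2024 · Updated 2 years ago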
Tzenthin / wenet_mnn
Speech recognition model converted from PyTorch to ONNX to MNN, with deployment implemented in C++
☆85 · Sep 1, 2022 · Updated 3 years ago
wonjune-kang / expressive-speech-retrieval
Expressive Speech Retrieval using Natural Language Descriptions of Speaking Style
☆15 · Aug 18, 2025 · Updated 11 months ago
tarun360 / SpeakerProfiling
Estimating the age, height, and gender of a speaker from their speech signal.
☆15 · Sep 19, 2022 · Updated 3 years ago
liyunlongaaa / NSD-MS2S
CHiME-7/8 diarization champion system: neural speaker diarization using memory-aware multi-speaker embedding with sequence-to-sequence ar…
☆88 · Jun 17, 2025 · Updated last year
joonaskalda / PixIT
Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings…
☆105 · Jan 10, 2025 · Updated last year
zhenghuatan / rVADfast
This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsuperv…
☆154 · Updated this week