v3ucn/ASR_TOOLS_SenseVoice_WebUI

Standalone integrated WebUI for Bert-vits2 transcription and annotation, integrating Alibaba FunASR, Bcut (必剪) ASR, and the Whisper large model.

☆182

Alternatives and similar repositories for ASR_TOOLS_SenseVoice_WebUI

Users that are interested in ASR_TOOLS_SenseVoice_WebUI are comparing it to the libraries listed below.

v3ucn / Fix-Loudness
Audio loudness unification and volume normalization.
☆13 · May 3, 2024 · Updated 2 years ago
pengzhendong / audiolab
A streaming audio reader, processor, and writer built on top of soundfile and PyAV (bindings for FFmpeg)
☆39 · Mar 31, 2026 · Updated 3 months ago
v3ucn / Bert-VITS2-Extra_-
Training and inference for Bert-VITS2-Extra, the Chinese-specialized version.
☆26 · Feb 10, 2024 · Updated 2 years ago
backspacetg / distilXLSR
Models and code for the INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
☆13 · Mar 30, 2025 · Updated last year
pengzhendong / compute-wer
Compute WER and SER for speech recognition evaluation
☆27 · Jun 6, 2026 · Updated last month
scutcsq / Neural-Transducers-for-Two-Stage-Text-to-Speech-via-Semantic-Token-Prediction
Unofficial PyTorch reproduction of the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…
☆60 · Apr 4, 2024 · Updated 2 years ago
v3ucn / CosyVoice_For_Windows
A version of CosyVoice for use on Windows.
☆766 · Nov 19, 2024 · Updated last year
v3ucn / Bert-vits2-V2.3
Bert-vits2-V2.3 training and inference.
☆49 · Mar 13, 2024 · Updated 2 years ago
pengzhendong / asr-decoder
CTC decoder with hotwords for ASR.
☆38 · Jun 15, 2026 · Updated last month
v3ucn / zhangyimou_voice_clone_text
One-click voice cloning and parody text generation project for Zhang Yimou (nicknamed "Guoshi").
☆17 · Jun 15, 2023 · Updated 3 years ago
lovemefan / CT-Transformer-punctuation
An enterprise-grade Chinese-English code-switching punctuator from funasr.
☆34 · Apr 26, 2024 · Updated 2 years ago
v3ucn / Modelscope_Faster_Whisper_Multi_Subtitle
Bilingual subtitle generator built on offline large models. Generate bilingual subtitles with one click based on Faster-whisper and modelscope. O…
☆419 · Dec 1, 2024 · Updated last year
cronrpc / Audio-Speaker-Needle-In-Haystack
Finding the most similar timbre (tone color) in a large collection of audio.
☆13 · Jun 17, 2024 · Updated 2 years ago
pengzhendong / audio-pipeline
☆23 · Oct 17, 2024 · Updated last year
SoonSYJ / fawasr
Offline on-device Android build of FunASR, supporting all 2pass modes.
☆15 · Sep 4, 2023 · Updated 2 years ago
lovemefan / fsmn-vad
An enterprise-grade Voice Activity Detector from modelscope and funasr.
☆139 · Apr 26, 2023 · Updated 3 years ago
MorenoLaQuatra / vad
Simple voice activity detection (VAD) algorithm in Python
☆15 · Aug 10, 2023 · Updated 2 years ago
tonnetonne814 / PL-Bert-VITS2
VITS2 using Phoneme-Level Japanese BERT
☆14 · Dec 17, 2023 · Updated 2 years ago
Mddct / simple-tts
(WIP) Long-form speech generation
☆30 · Apr 2, 2025 · Updated last year
lovemefan / paraformer-python
Paraformer (Chinese ASR) online ONNX Runtime for Python
☆54 · Mar 27, 2024 · Updated 2 years ago
litagin02 / laughter-collector
A tool that collects the laughter segments from large amounts of audio data.
☆14 · May 23, 2024 · Updated 2 years ago
v3ucn / IMAGdressing_WebUi_For_Windows
A WebUI for running IMAGdressing on Windows.
☆21 · Jul 25, 2024 · Updated 2 years ago
lovemefan / SenseVoice-python
SenseVoice-python: An enterprise-grade open-source multi-language ASR system from funasr with onnxruntime
☆114 · Jun 12, 2026 · Updated last month
lifeiteng / Aligner-SUPERB
Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark
☆39 · May 7, 2025 · Updated last year
Mddct / transformer-vocos
☆35 · Sep 6, 2025 · Updated 10 months ago
yxlllc / ReFlow-VAE-SVC
☆158 · Feb 6, 2025 · Updated last year
LAION-AI / emotional-speech-annotations
This repository contains prompts and best practices for annotating audio clips with a very high degree of detail using Audio-Language-Models
☆35 · Oct 13, 2024 · Updated last year
v3ucn / Bert-vits2-Extra-Stream-webui-api
Streaming inference and a streaming API added on top of the Bert-vits2-Extra project.
☆16 · Apr 12, 2024 · Updated 2 years ago
0x5446 / api4sensevoice
API and websocket server for sensevoice. It has inherited some enhanced features, such as VAD detection, real-time streaming recognition,…
☆538 · Oct 23, 2024 · Updated last year
liuhuang31 / g2pw_once
G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b…
☆14 · Dec 30, 2023 · Updated 2 years ago
shivammehta25 / BetterFastSpeech2
Just another FastSpeech 2 but cleaner code :)
☆29 · Jun 28, 2024 · Updated 2 years ago
pengzhendong / torchfa
Torch Audio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English.
☆61 · Sep 5, 2025 · Updated 10 months ago
multimodal-art-projection / Open-Suno
Trying to reproduce Suno v3
☆34 · Jan 29, 2025 · Updated last year
TylorShine / MNP-SVC
Real-time end-to-end singing voice conversion
☆25 · Nov 3, 2024 · Updated last year
mozillazg / pypinyin-g2pW
Improves the accuracy of pypinyin using g2pW.
☆104 · Jun 24, 2023 · Updated 3 years ago
shinhyeokoh / rwen
☆14 · Jun 16, 2023 · Updated 3 years ago
audiodemo / voice-conversion
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
☆17 · Aug 18, 2023 · Updated 2 years ago
QwenAudio / SenseVoice
Open-source SenseVoiceSmall model for Mandarin, Cantonese, English, Japanese, and Korean ASR, language ID, emotion recognition, and audio…
☆8,935 · Updated this week
TomJwYu / WenetSpeechSpeakerCluster
☆55 · Jul 17, 2023 · Updated 3 years ago