Standalone integrated WebUI for Bert-vits2 transcription and annotation, integrating Alibaba FunAsr, 必剪 (Bcut) Asr, and the Whisper large model
☆182Jul 10, 2024Updated last year
Alternatives and similar repositories for ASR_TOOLS_SenseVoice_WebUI
Users that are interested in ASR_TOOLS_SenseVoice_WebUI are comparing it to the libraries listed below.
- Compute WER and SER for speech recognition evaluation☆26Mar 18, 2026Updated 2 months ago
- Audio loudness unification and volume normalization☆13May 3, 2024Updated 2 years ago
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆13Mar 30, 2025Updated last year
- Bert-VITS2-Extra (Chinese-specialized version): training and inference☆26Feb 10, 2024Updated 2 years ago
- CTC decoder with hotwords for ASR.☆36Apr 13, 2025Updated last year
- A version of CosyVoice for use on Windows☆765Nov 19, 2024Updated last year
- Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…☆60Apr 4, 2024Updated 2 years ago
- FunASR on-device offline version for Android (2pass, all modes)☆15Sep 4, 2023Updated 2 years ago
- Bert-vits2-V2.3 training and inference☆50Mar 13, 2024Updated 2 years ago
- Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark☆36May 7, 2025Updated last year
- One-click voice cloning and parody text generation project for Zhang Yimou ("Guoshi")☆17Jun 15, 2023Updated 2 years ago
- An enterprise-grade Chinese-English code-switch punctuator from funasr.☆33Apr 26, 2024Updated 2 years ago
- Bilingual subtitle generator based on offline large models. Generate bilingual subtitles with one click based on Faster-whisper and modelscope. O…☆420Dec 1, 2024Updated last year
- A streaming audio reader, processor, and writer built on top of soundfile, and PyAV (bindings for FFmpeg)☆38Mar 31, 2026Updated last month
- ☆23Oct 17, 2024Updated last year
- Finding the most similar timbre (tone color) in a large collection of audio.☆13Jun 17, 2024Updated last year
- An enterprise-grade Voice Activity Detector from modelscope and funasr.☆136Apr 26, 2023Updated 3 years ago
- Simple voice activity detection (VAD) algorithm in Python☆15Aug 10, 2023Updated 2 years ago
- (WIP) long-form speech generation☆31Apr 2, 2025Updated last year
- paraformer (Chinese ASR) online ONNX Runtime for Python☆54Mar 27, 2024Updated 2 years ago
- VITS2 using Phoneme-Level Japanese BERT☆14Dec 17, 2023Updated 2 years ago
- WebUI for running IMAGdressing on Windows☆22Jul 25, 2024Updated last year
- Streaming inference and a streaming API added on top of the Bert-vits2-Extra project☆16Apr 12, 2024Updated 2 years ago
- 🇺🇦 Open Source Ukrainian Text-to-Speech datasets☆28Feb 24, 2025Updated last year
- A tool that collects the laughter segments from large amounts of audio data☆12May 23, 2024Updated 2 years ago
- Multilingual Voice Understanding Model☆8,216Updated this week
- SenseVoice-python: An enterprise-grade open-source multilingual ASR system from funasr, running on onnxruntime☆112Oct 6, 2025Updated 7 months ago
- ☆36Sep 6, 2025Updated 8 months ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- ☆158Feb 6, 2025Updated last year
- API and websocket server for sensevoice. It has inherited some enhanced features, such as VAD detection, real-time streaming recognition,…☆540Oct 23, 2024Updated last year
- trying to reproduce suno v3☆34Jan 29, 2025Updated last year
- Torch Audio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English.☆61Sep 5, 2025Updated 8 months ago
- G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b…☆14Dec 30, 2023Updated 2 years ago
- This is a repository that collects common audio noise reduction models, using Gradio to demonstrate the use of each model, which is very …☆65Apr 18, 2026Updated last month
- Just another FastSpeech 2 but cleaner code :)☆29Jun 28, 2024Updated last year
- Real-time end-to-end singing voice conversion☆25Nov 3, 2024Updated last year
- ☆14Jun 16, 2023Updated 2 years ago
- TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud…☆111Dec 20, 2024Updated last year