Standalone, integrated WebUI for Bert-vits2 transcription and annotation, combining Alibaba FunAsr, Bcut (必剪) Asr, and the Whisper large model
☆182Jul 10, 2024Updated last year
Alternatives and similar repositories for ASR_TOOLS_SenseVoice_WebUI
Users that are interested in ASR_TOOLS_SenseVoice_WebUI are comparing it to the libraries listed below.
- Compute WER and SER for speech recognition evaluation☆27Mar 18, 2026Updated last month
- Audio loudness unification and volume normalization☆13May 3, 2024Updated 2 years ago
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆13Mar 30, 2025Updated last year
- Bert-VITS2-Extra (Chinese-specialized version): training and inference☆26Feb 10, 2024Updated 2 years ago
- CTC decoder with hotwords for ASR.☆35Apr 13, 2025Updated last year
- A version of CosyVoice for use in Windows environments☆764Nov 19, 2024Updated last year
- Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…☆60Apr 4, 2024Updated 2 years ago
- Offline on-device Android version of FunASR, full 2pass mode☆15Sep 4, 2023Updated 2 years ago
- Bert-vits2-V2.3: training and inference☆50Mar 13, 2024Updated 2 years ago
- Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark☆36May 7, 2025Updated 11 months ago
- One-click voice cloning and parody text generation project for Zhang Yimou ("Guoshi")☆17Jun 15, 2023Updated 2 years ago
- An enterprise-grade Chinese-English code-switch punctuator from funasr.☆33Apr 26, 2024Updated 2 years ago
- Bilingual subtitle generator based on offline large models. Generate bilingual subtitles with one click based on Faster-whisper and modelscope. O…☆420Dec 1, 2024Updated last year
- A streaming audio reader, processor, and writer built on top of soundfile and PyAV (bindings for FFmpeg)☆38Mar 31, 2026Updated last month
- ☆23Oct 17, 2024Updated last year
- Finding the most similar tone/color (timbre) in a large collection of audio.☆13Jun 17, 2024Updated last year
- An enterprise-grade Voice Activity Detector from modelscope and funasr.☆134Apr 26, 2023Updated 3 years ago
- Simple voice activity detection (VAD) algorithm in Python☆15Aug 10, 2023Updated 2 years ago
- (WIP) long-form speech generation☆31Apr 2, 2025Updated last year
- paraformer (Chinese ASR) online onnx runtime for python☆54Mar 27, 2024Updated 2 years ago
- VITS2 using Phoneme-Level Japanese BERT☆14Dec 17, 2023Updated 2 years ago
- WebUI for running IMAGdressing in Windows environments☆22Jul 25, 2024Updated last year
- Streaming inference and streaming API features added on top of the Bert-vits2-Extra project☆16Apr 12, 2024Updated 2 years ago
- 🇺🇦 Open Source Ukrainian Text-to-Speech datasets☆26Feb 24, 2025Updated last year
- A tool that collects the laughter segments from large amounts of audio data☆12May 23, 2024Updated last year
- SenseVoice-python: An enterprise-grade open-source multi-language ASR system from funasr, with onnxruntime☆112Oct 6, 2025Updated 7 months ago
- ☆36Sep 6, 2025Updated 8 months ago
- Multilingual Voice Understanding Model☆8,072Dec 30, 2025Updated 4 months ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of detail using Audio-Language-Models☆35Oct 13, 2024Updated last year
- ☆157Feb 6, 2025Updated last year
- API and websocket server for sensevoice. It has inherited some enhanced features, such as VAD detection, real-time streaming recognition,…☆540Oct 23, 2024Updated last year
- trying to reproduce suno v3☆34Jan 29, 2025Updated last year
- Torch Audio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English.☆61Sep 5, 2025Updated 8 months ago
- G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b…☆14Dec 30, 2023Updated 2 years ago
- This is a repository that collects common audio noise reduction models, using Gradio to demonstrate the use of each model, which is very …☆60Apr 18, 2026Updated 2 weeks ago
- Just another FastSpeech 2 but cleaner code :)☆29Jun 28, 2024Updated last year
- ☆14Jun 16, 2023Updated 2 years ago
- Real-time end-to-end singing voice conversion☆25Nov 3, 2024Updated last year
- TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud…☆111Dec 20, 2024Updated last year