Standalone integrated WebUI for Bert-vits2 transcription and annotation, integrating Alibaba FunAsr, Bcut (必剪) Asr, and the Whisper large model
☆184Jul 10, 2024Updated last year
Alternatives and similar repositories for ASR_TOOLS_SenseVoice_WebUI
Users that are interested in ASR_TOOLS_SenseVoice_WebUI are comparing it to the libraries listed below
- Compute WER and SER for speech recognition evaluation☆26Dec 15, 2025Updated 2 months ago
- CTC decoder with hotwords for ASR.☆34Apr 13, 2025Updated 10 months ago
- Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark☆35May 7, 2025Updated 9 months ago
- Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…☆61Apr 4, 2024Updated last year
- Audio loudness unification and volume normalization☆12May 3, 2024Updated last year
- Finding the most similar timbre (tone color) in a large collection of audio.☆13Jun 17, 2024Updated last year
- VITS2 using Phoneme-Level Japanese BERT☆14Dec 17, 2023Updated 2 years ago
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆13Mar 30, 2025Updated 11 months ago
- ☆23Oct 17, 2024Updated last year
- Bert-VITS2-Extra (Chinese-specialized version): training and inference☆26Feb 10, 2024Updated 2 years ago
- Bert-vits2-V2.3: training and inference☆50Mar 13, 2024Updated last year
- Bilingual subtitle generator built on offline large models. Generate bilingual subtitles with one click based on Faster-whisper and modelscope. O…☆414Dec 1, 2024Updated last year
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Aug 18, 2023Updated 2 years ago
- A version of CosyVoice for use on Windows☆754Nov 19, 2024Updated last year
- A streaming audio reader, processor, and writer built on top of soundfile and PyAV (bindings for FFmpeg)☆38Jan 15, 2026Updated last month
- G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b…☆14Dec 30, 2023Updated 2 years ago
- Simple voice activity detection (VAD) algorithm in Python☆15Aug 10, 2023Updated 2 years ago
- Improving the accuracy of pypinyin using g2pW☆104Jun 24, 2023Updated 2 years ago
- An enterprise-grade Voice Activity Detector from modelscope and funasr.☆129Apr 26, 2023Updated 2 years ago
- Official implementation of the paper titled "Age and Gender Recognition Using a Convolutional Neural Network with a Specially Designed Mu…☆27Mar 5, 2024Updated last year
- One-click voice cloning and parody text generation project for Zhang Yimou (nicknamed "Guoshi")☆17Jun 15, 2023Updated 2 years ago
- SenseVoice-python: An enterprise-grade open-source multi-language ASR system from funasr, running on onnxruntime☆109Oct 6, 2025Updated 4 months ago
- trying to reproduce suno v3☆35Jan 29, 2025Updated last year
- An enterprise-grade Chinese-English code-switching punctuator from funasr.☆31Apr 26, 2024Updated last year
- (WIP) long-form speech generation☆31Apr 2, 2025Updated 11 months ago
- FunASR on-device offline version for Android (2pass, all modes)☆14Sep 4, 2023Updated 2 years ago
- text to speech☆10Mar 19, 2024Updated last year
- Onset-and-Offset-Aware Sound Event Detection☆21Feb 10, 2025Updated last year
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- Open Source Speech/Text Data on AI☆19Sep 13, 2022Updated 3 years ago
- ☆36Sep 6, 2025Updated 5 months ago
- TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud…☆111Dec 20, 2024Updated last year
- 🇺🇦 Open Source Ukrainian Text-to-Speech datasets☆22Feb 24, 2025Updated last year
- ☆19Mar 22, 2024Updated last year
- Real-time end-to-end singing voice conversion☆24Nov 3, 2024Updated last year
- paraformer (Chinese ASR) online onnx runtime for Python☆53Mar 27, 2024Updated last year
- ☆14Jun 16, 2023Updated 2 years ago
- A tool that collects laughter segments from large amounts of audio data☆12May 23, 2024Updated last year
- API and websocket server for sensevoice. It has inherited some enhanced features, such as VAD detection, real-time streaming recognition,…☆538Oct 23, 2024Updated last year