yeyupiaoling / Whisper-Finetune
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment
☆1,176Updated last month
Alternatives and similar repositories for Whisper-Finetune
Users that are interested in Whisper-Finetune are comparing it to the libraries listed below
- Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training wit…☆313Updated 3 weeks ago
- ☆823Updated last year
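- 📣 A commercial-grade open-source automatic speech recognition library: works out of the box, supports all platforms, and recognizes mixed Chinese-English speech. A cross-platform implementation of ASR inference, based on ONNXRuntime and FunASR. We provide …☆595Updated last year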
- Pseudo Streaming SenseVoice with Hotwords☆419Updated 10 months ago
- Text Normalization & Inverse Text Normalization☆718Updated last month
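- End-to-end Chinese speech recognition built on PaddlePaddle, from getting started to practical use, with very simple introductory examples and highly practical enterprise projects. Supports the currently most popular models: DeepSpeech2, Conformer, and Squeezeformer☆877Updated last month
- Chinese speech pretrained models☆1,187Updated last year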
- SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.☆535Updated 9 months ago
- Production First and Production Ready End-to-End Keyword Spotting Toolkit☆677Updated 4 months ago
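- A streaming and non-streaming automatic speech recognition framework implemented in PyTorch, compatible with both online and offline recognition. Currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, along with multiple data augmentation methods.☆718Updated last month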
- Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR be…☆1,713Updated 3 weeks ago
- KAN-TTS is a speech-synthesis training framework, please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-…☆525Updated 2 years ago
- Speech-to-text server framework with next-gen Kaldi☆860Updated this week
- Port of FunASR's SenseVoice model in C/C++☆508Updated 3 weeks ago
- The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.☆1,860Updated last year
- ☆1,338Updated last month
- This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, CAM++, etc. It is not exclud…☆1,222Updated last month
- Best practice TTS based on BERT and VITS with some Natural Speech Features Of Microsoft; Support ONNX streaming out!☆1,222Updated last year
- A 10000+ hours dataset for Chinese speech recognition☆584Updated last week
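- This project implements ASR inference in C++. It has few dependencies, is easy to install, runs inference quickly, and runs smoothly on ARM platforms such as the Raspberry Pi 4B. The supported models are optimized from Google's Transformer model, and the dataset is the open-source wenetspeech (10000+ hours) or Alibaba's private dataset (60000+…☆543Updated 2 years ago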
- A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization☆2,727Updated last month
- API and websocket server for sensevoice. It has inherited some enhanced features, such as VAD detection, real-time streaming recognition,…☆540Updated last year
- Accelerates cosyvoice2 inference using vllm☆467Updated 8 months ago
- Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit☆1,149Updated 2 weeks ago
- The dataset of Speech Recognition☆444Updated last week
- Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn without Internet connection. Support iOS, …☆1,604Updated 2 months ago
- Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.☆691Updated last month
- A pseudo-real-time speech transcription service based on faster-whisper☆234Updated 8 months ago
- ☆377Updated last year
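- Continues training on the Biaobei (标贝) data and improves the original FastSpeech2 model by introducing prosody representation and prosody prediction modules, making Chinese pronunciation more vivid and rhythmic☆275Updated 2 years ago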