shuaijiang / Whisper-Finetune
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment
☆163Updated 3 months ago
Related projects:
- ☆463Updated 3 months ago
- KAN-TTS is a speech-synthesis training framework; please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-…☆483Updated 8 months ago
- Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training wit…☆813Updated 2 months ago
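- A pseudo-real-time speech transcription service based on faster-whisper☆160Updated this week
- Continues training on the Biaobei (标贝) dataset and improves the original FastSpeech2 model by introducing prosody representation and prosody prediction modules, making Chinese pronunciation more vivid and rhythmic☆234Updated last year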
- ☆258Updated last month
- API and websocket server for sensevoice. It has inherited some enhanced features, such as VAD detection, real-time streaming recognition,…☆129Updated 2 weeks ago
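- A Chinese punctuation model that adds punctuation marks to text.☆128Updated 6 months ago
- A commercial-grade open-source automatic speech recognition library: works out of the box, supports all platforms, and recognizes mixed Chinese and English. A cross-platform implementation of ASR inference. It's based on ONNXRuntime and FunASR. We provide a s…☆491Updated 4 months ago
- A powerful multilingual (97 languages) tool that automatically recognizes and segments mixed-language text content for TTS.☆90Updated last week
- TTS application based on modelscope KAN-TTS☆43Updated 5 months ago
- The first open-source, commercially usable dialogue model to support bilingual (Chinese and English) speech-text multimodal conversation. Convenient speech input greatly improves the experience of using large models that take text as input, while avoiding the cumbersome pipeline of ASR-based solutions and the errors it may introduce.☆519Updated last year
- Real-time STT connected to the OpenAI API / Zhipu AI (streaming LLM) and GPT-SOVITS / Edge-TTS, calling services across networks through a web page to achieve real-time conversation☆199Updated 2 months ago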
- Text Normalization & Inverse Text Normalization☆445Updated 2 weeks ago
- Production First and Production Ready End-to-End Text-to-Speech Toolkit☆367Updated 3 months ago
- CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models☆390Updated 5 months ago
- TTS☆49Updated 3 months ago
- SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.☆426Updated last week
- The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.☆1,069Updated last month
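- Chinese speech pretrained models☆1,004Updated 3 weeks ago
- A C++ project for ASR inference. It has few dependencies, is easy to install, runs inference quickly, and runs smoothly on ARM platforms such as the Raspberry Pi 4B. The supported models are optimized from Google's Transformer model, and the dataset is the open-source wenetspeech (10000+ hours) or Alibaba's private dataset (60000+ h…☆482Updated last year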
- The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.☆1,390Updated 2 months ago
- Best-practice TTS based on BERT and VITS with some NaturalSpeech features from Microsoft; supports ONNX streaming output!☆1,146Updated 7 months ago
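- This project uses several advanced speaker (voiceprint) recognition models, including EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and supports multiple data preprocessing methods, including MelSpectrogram, Spectrogram, MFCC, and Fbank☆213Updated 2 weeks ago
- Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine. Chinese and English speech recognition, multi-speaker speech synthesis, multilingual support, high accuracy☆458Updated 6 months ago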
- a TTS demo for training new characters.☆432Updated 8 months ago
- An end-to-end voice wake-up toolkit, from model training to model inference.☆64Updated 2 weeks ago
- Phi3 Chinese repository☆313Updated 4 months ago
- A 10000+ hours dataset for Chinese speech recognition☆490Updated last year
- Chat with any character you like: ChatGLM2+SadTalker+Voice Cloning | Have immersive conversations with your favorite characters: ChatGLM2 + voice cloning + video dialogue☆584Updated last year