Gloridust / whisper_streaming_CN
Whisper realtime streaming for long speech-to-text transcription and translation
☆24Updated 7 months ago
Related projects
Alternatives and complementary repositories for whisper_streaming_CN
- A pseudo-real-time speech transcription service based on faster-whisper☆182Updated 2 months ago
- ChatTTS HTTP API☆48Updated 5 months ago
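- A tool for converting text corpora into training sets (txt to dataset)☆78Updated 6 months ago
- Real-time STT connected to the OpenAI API/Zhipu AI (streaming LLM) and GPT-SOVITS/Edge-TTS, with services called across networks through a web page to achieve real-time conversation☆254Updated 4 months ago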
- a gradio webui for faster whisper☆232Updated last year
- ☆56Updated 3 weeks ago
- Real time faster whisper gradio☆25Updated last month
- API and websocket server for sensevoice. It has inherited some enhanced features, such as VAD detection, real-time streaming recognition,…☆222Updated 3 weeks ago
- TianMu: A modern AI tool with multi-platform support, markdown support, multimodal, continuous conversation, and customizable commands. …☆84Updated last year
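- Analysis of Chinese and English document layouts☆126Updated last month
- An API project for CosyVoice☆79Updated 3 weeks ago
- The app stays resident in the phone's background, so you can keep communicating with the Fay digital human anytime, anywhere.☆32Updated 2 months ago
- 📝 Extracts content from document images and reproduces them one-to-one in Word or Txt for further use or processing. Planned: PDF/image input with output in JSON, Txt, Word, and Markdown formats.☆150Updated 2 weeks ago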
- Implement OpenAI APIs and plugin-enabled ChatGPT with open source LLM and other models.☆122Updated 5 months ago
- SenseVoice-python: An enterprise-grade open-source multi-language ASR system from FunASR, running on onnxruntime☆74Updated last month
- An open-source agentic workflow framework/showcase that turns chat text into control actions, powered by the Agently AI application development framework.☆24Updated 2 months ago
- Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training wit…☆189Updated this week
- ☆287Updated 3 months ago
- Using GPT to parse PDF☆68Updated 2 months ago
- Extracts dialogue datasets from novels☆104Updated 5 months ago
- ☆100Updated 3 months ago
- Sample GLM4V + ChatTTS AI assistant☆84Updated 5 months ago
- An API project for SenseVoice that outputs subtitles with timestamps☆13Updated 3 weeks ago
- ☆68Updated 10 months ago
- VideoFinder is an advanced video analysis tool powered by multimodal AI, designed to help users easily locate and identify specific objec…☆59Updated last week
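- A FastAPI wrapper for Alibaba's SenseVoice, released in ONNX format with a quantized model included and GPU support. Supports speech recognition from files fetched by URL.☆46Updated 2 months ago
- virtualwife-llm-factory is an LLM training framework that lowers the high barrier to entry for training virtual characters. Its core capabilities include automatic corpus generation, personality-shaping evaluation, and fine-tuning on Chinese domestic LLMs. It is still under development; feel free to star the repo to follow along.☆31Updated 4 months ago
- A merged version of GraphRAG-Ollama-UI and GraphRAG4OpenWebUI (with a Gradio web UI for configuring and generating RAG indexes, and FastAPI providing the RAG API service)☆88Updated 3 months ago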
- Pseudo Streaming SenseVoice with Hotwords☆85Updated 2 weeks ago