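This open-source project aims to provide a model that automatically detects and recognizes Chinese speech, and it supports uploading audio in formats such as wav, mp4, and m4a. Whether the input is a wav file captured from a recording device or an mp4 or m4a file extracted from a video, the model recognizes the Chinese text it contains. By integrating state-of-the-art speech recognition techniques and deep learning algorithms, the model converts speech to text quickly and accurately, giving users a convenient speech recognition experience.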
☆44Jun 6, 2024Updated last year
Alternatives and similar repositories for voice_translation
Users that are interested in voice_translation are comparing it to the libraries listed below.
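- The 不挂科AI backend is a web application built on the FastAPI framework that provides a set of intelligent services, including video-to-PPT conversion, PPT-to-PDF conversion, PDF and PPT content parsing, exam key-point outline generation, question generation, and mind-map generation. The backend uses several Python libraries, such as FastAPI, PyPDF2, pyt…☆16Oct 30, 2024Updated last year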
- ☆13Mar 10, 2024Updated 2 years ago
- Code for ICCV 2023 work "Generalized Few-Shot Point Cloud Segmentation Via Geometric Words"☆13Sep 26, 2023Updated 2 years ago
- ☆10May 5, 2025Updated 11 months ago
- ☆15Apr 3, 2025Updated last year
- combine ASR, LLM and TTS in local development with python☆17Sep 21, 2024Updated last year
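- A ChatGPT chat application built with Gradio that supports text or voice conversation: audio sent by the user is converted to text with OpenAI's STT, ChatGPT generates a reply, and the reply is synthesized with OpenAI TTS, returned, and played automatically, enabling voice chat.☆35Feb 18, 2024Updated 2 years ago
- Locally deployed audio/video transcription with speaker diarization (SD) and LLM summarization, modified from FunClip; works offline.☆51Jan 2, 2025Updated last year
- LSTM-based anomaly detection☆29Apr 30, 2019Updated 6 years ago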
- Cook feeds☆10Nov 11, 2022Updated 3 years ago
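- Intelligent video processing system☆47Dec 26, 2024Updated last year
- Because the perler-bead software on the market falls short, I built on the open-source project Zippland/perler-beads, improved it with AI, added AI-assisted image optimization, and wrote a website dedicated to generating perler-bead patterns. After extensive testing, I think it can now generate a perler-bead pattern in one click!☆57Feb 26, 2026Updated last month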
- To deploy Transformer models in CV to mobile devices.☆18Jan 20, 2022Updated 4 years ago
- ☆24May 12, 2024Updated last year
- 凌波微步 (Lingbo Weibu), an online step-count boosting tool (currently supports WeChat, Alipay, QQ, Ali Sports, DingTalk...)☆10Jan 4, 2023Updated 3 years ago
- myblog☆12Feb 27, 2018Updated 8 years ago
- ☆12May 24, 2022Updated 3 years ago
- A translation tool based on Next.js and large AI models, capable of simultaneously calling translation results from multiple large models…☆16Dec 19, 2024Updated last year
- Test Framework for few-shot open set KWS☆42Nov 8, 2024Updated last year
- Research sources on graph-based anomaly detection☆13Nov 29, 2022Updated 3 years ago
- Code for running experiments and benchmarking on GNNExplainer: Generating Explanations for Graph Neural Networks☆15May 8, 2021Updated 4 years ago
- Automatic method for the recognition of hand gestures for the categorization of vowels and numbers in Colombian sign language based on Ne…☆15Nov 18, 2018Updated 7 years ago
- Recent papers on Graph Neural Networks-based Recommender System.☆12Aug 21, 2023Updated 2 years ago
- ☆10Apr 5, 2023Updated 3 years ago
- [ICLR 2025 Spotlight] Multimodality Helps Few-shot 3D Point Cloud Semantic Segmentation☆69May 7, 2025Updated 11 months ago
- ☆19Nov 26, 2023Updated 2 years ago
- ☆12Sep 19, 2022Updated 3 years ago
- A PyQt5 desktop app for speech-to-text transcription using the SiliconFlow API, with a graphical interface, batch transcription of audio files, and editing and management of results☆20Dec 16, 2024Updated last year
- [MICCAI 2023] Multi-View Vertebra Localization and Identification from CT Images☆39Sep 19, 2024Updated last year
- Floating-Point Optimized On-Device Learning Library for the PULP Platform.☆42Mar 3, 2026Updated last month
- ☆29Feb 27, 2026Updated last month
- Data repository for LaTeX OCR☆140Jun 11, 2024Updated last year
- Another implementation of the paper "Compound Word Transformer: Learning to Compose Full-Song Music over Dynamic Directed Hypergraphs" in…☆13Jun 30, 2021Updated 4 years ago
- An open-source, easily accessible package for training and deploying Speech-to-Intent models on microcontrollers and SBCs☆50Mar 14, 2024Updated 2 years ago
- A fast reverse proxy to help you expose a local server behind a NAT or firewall to the internet.☆18May 1, 2024Updated last year
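- A UIE-based web system for public-opinion sentiment analysis, deployed with a separated front end and back end. It supports aspect-level sentiment analysis of a single text and batch sentiment analysis of uploaded txt files, with visualization of the results. Tech stack: back end FastAPI + UIE; front end Vue + ElementUI + Echarts.☆67Apr 24, 2023Updated 2 years ago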
- Clustering using Deep Learning (t-SNE visualization of autoencoder embeddings)☆10Mar 3, 2019Updated 7 years ago
- Anomaly Detection for time-series using Multilevel Wavelet Decomposition Networks.☆10Dec 11, 2019Updated 6 years ago
- ☆11Apr 8, 2023Updated 3 years ago