ai-bot-pro / achatbot
An open-source chatbot architecture for voice/vision (and multimodal) assistants that can run locally (CPU/GPU bound) or remotely (I/O bound).
☆25Updated this week
Alternatives and similar repositories for achatbot:
Users who are interested in achatbot are comparing it to the libraries listed below:
- SenseVoice-python: An enterprise-grade, open-source, multi-language ASR system from FunASR, with onnxruntime☆83Updated 4 months ago
- A WebRTC server that allows you to interact with an LLM using your speech and responds back with generated audio.☆122Updated 8 months ago
- Speech To Speech: an effort for an open-sourced and modular GPT4-o☆39Updated 4 months ago
- We Speech Transcript based on LLM, in 300 lines of code.☆142Updated 2 weeks ago
- Real time faster whisper gradio☆26Updated 4 months ago
- Streaming ASR and TTS based on FastAPI + sherpa-onnx☆73Updated 4 months ago
- Have a natural voice conversation with an LLM☆237Updated 2 months ago
- flow mirror models from JZX AI Labs☆42Updated 4 months ago
- SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems☆81Updated last year
- An enterprise-grade Voice Activity Detector from modelscope and funasr.☆75Updated last year
- The YouTube Text-To-Speech dataset is comprised of waveform audio extracted from YouTube videos alongside their English transcriptions☆51Updated 3 years ago
- An API project for SenseVoice that outputs subtitles with timestamps☆24Updated 3 months ago
- A lightweight pure C++ Text-to-Speech (TTS) pipeline with OpenVINO, supporting multiple languages.☆38Updated 3 weeks ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.☆57Updated last week
- Running the F5-TTS by ONNX Runtime☆104Updated last week
- F5-TTS inference acceleration, roughly 4x faster!☆44Updated last month
- ASR using an OpenAI-compatible API (`v1/audio/transcriptions`), such as Groq or SiliconFlow☆27Updated 5 months ago
- An open-source agentic workflow framework/showcase that turns chat text into control actions, powered by the Agently AI application development framework.☆26Updated 4 months ago
- A simple system for two-way interruptible voice interactions between a human and an LLM☆21Updated last year
- A Gradio WebUI for Andrew Ng's translation-agent☆27Updated 2 months ago
- Speech recognition based on FunASR, including a standard version and an ONNX version (recommended).☆28Updated 4 months ago
- CTC decoder with hotwords for ASR.☆16Updated 3 weeks ago
- Example agents I've built using the LiveKit Agents (https://github.com/livekit/agents) framework☆18Updated 9 months ago
- ☆9Updated 7 months ago
- LSLM implements full duplex modeling in interactive speech language models, based on research by Ma et al. (2024). This project advances …☆62Updated last month
- Efficient approach to speaker diarization using voice characteristics extraction☆88Updated 9 months ago
- A lightweight end-to-end text-to-speech model☆102Updated last month
- Bambo is a new proxy framework. Compared with mainstream frameworks, it is more lightweight and flexible and can handle various load task…☆35Updated last week
- This is a repository that collects common audio noise reduction models, using Gradio to demonstrate the use of each model, which is very …☆31Updated 2 months ago