zilliz-bootcamp / audio_search
This project uses PANNs for audio tagging and sound event detection to obtain audio embeddings. Milvus is then used to search for similar audio items.
☆22 Updated 3 years ago
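The embed-then-search flow described above can be illustrated with a minimal sketch. It does not use PANNs or Milvus: the embeddings are made-up three-dimensional vectors, the file names are hypothetical, and a brute-force cosine-similarity scan stands in for the vector database lookup.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def search_similar(query, index, top_k=2):
    """Return the top_k (name, score) pairs most similar to the query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical embeddings; a real system would take these from an audio model.
index = {
    "dog_bark.wav": [0.9, 0.1, 0.0],
    "cat_meow.wav": [0.1, 0.9, 0.0],
    "dog_growl.wav": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]  # hypothetical embedding of the query clip

results = search_similar(query, index, top_k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

A vector database replaces the linear scan with an approximate nearest-neighbor index, which matters once the collection grows beyond what a full scan can handle.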
Related projects
Alternatives and complementary repositories for audio_search
- Port of FunASR's Paraformer model in C/C++ ☆25 Updated 4 months ago
- Paraformer (Chinese ASR) online ONNX Runtime for Python ☆35 Updated 7 months ago
- A Tiny Project For ASR model training and Deployment ☆27 Updated 2 years ago
- Speech recognition model converted from PyTorch to ONNX to MNN, with C++ deployment ☆40 Updated 2 years ago
- Whisper in TensorRT-LLM ☆14 Updated last year
- An enterprise-grade Voice Activity Detector from modelscope and funasr. ☆61 Updated last year
- ☆30 Updated 3 years ago
- A library for adding punctuation into a text from ASR. ☆17 Updated last year
- End-to-end wake-word toolkit, from model training to model inference. ☆73 Updated 2 months ago
- qwen2 and llama3 cpp implementation ☆34 Updated 5 months ago
- ☆74 Updated 2 years ago
- A cross platform implementation of Text-to-Speech based on ONNXRuntime. ☆31 Updated last year
- libvits-ncnn is an ncnn implementation of the VITS library that enables cross-platform GPU-accelerated speech synthesis.🎙️💻 ☆56 Updated last year
- Clone a voice in 5 seconds to generate arbitrary speech in real-time ☆34 Updated 4 years ago
- ☆36 Updated 3 months ago
- Real-time video frame interpolation deployed with onnxruntime, with both C++ and Python versions ☆22 Updated 8 months ago
- ASR 2Pass onnxruntime and websocket server, based on FunASR (https://github.com/alibaba-damo-academy/FunASR). ☆52 Updated 2 months ago
- (Deprecated) WaveNet vocoder ☆21 Updated 4 years ago
- some ncnn demos of FunASR ☆16 Updated last month
- 百聆 (Bailing) is a GPT-4o-like voice chat bot built with ASR+LLM+TTS, with latency as low as 800 ms; it runs on low-end hardware and supports interruption ☆35 Updated last month
- ☆9 Updated 4 years ago
- ncnn HiFi-GAN ☆24 Updated last month
- ☆29 Updated 5 years ago
- Wanwu models release, code will be released soon ☆24 Updated 2 years ago
- ASR client for Triton ASR Service ☆18 Updated 3 weeks ago
- one script for xls-r/xlsr/whisper fine-tuning ☆39 Updated last year
- SenseVoice-python: an enterprise-grade open-source multilingual ASR system from FunASR, with onnxruntime ☆73 Updated last month
- ChatTTS is a generative speech model for daily dialogue. ☆12 Updated 2 weeks ago
- A repository for Chinese text normalization. ☆14 Updated 3 years ago
- Podcast Summarizer with LLM Technology ☆17 Updated last year