zilliz-bootcamp / audio_search
This project uses PANNs for audio tagging and sound event detection to obtain audio embeddings. Milvus is then used to search for similar audio items.
☆26Updated 4 years ago
Alternatives and similar repositories for audio_search
Users that are interested in audio_search are comparing it to the libraries listed below
- Speech recognition model converted from PyTorch to ONNX to MNN, with deployment implemented in C++☆73Updated 2 years ago
- libvits-ncnn is an ncnn implementation of the VITS library that enables cross-platform GPU-accelerated speech synthesis.🎙️💻☆62Updated 2 years ago
- An enterprise-grade Voice Activity Detector from modelscope and funasr.☆107Updated 2 years ago
- A lightweight pure C++ Text-to-Speech (TTS) pipeline with OpenVINO, supporting multiple languages.☆70Updated this week
- Port of Funasr's Paraformer model in C/C++☆33Updated last year
- ASR 2Pass onnxruntime and websocket server, based on FunASR (https://github.com/alibaba-damo-academy/FunASR).☆75Updated 3 weeks ago
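- This project uses several advanced voiceprint recognition models, including EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and also supports several data preprocessing methods, including MelSpectrogram, Spectrogram, MFCC, and Fbank☆281Updated 2 months ago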
- ☆76Updated 3 years ago
- A Tiny Project For ASR model training and Deployment☆27Updated 2 years ago
- ☆9Updated 5 years ago
- some ncnn demos of FunASR☆26Updated 10 months ago
- paraformer (Chinese ASR) online onnx runtime for python☆50Updated last year
- C/C++ implementation of the melspectrogram computation from the Python audio processing library librosa☆31Updated 3 years ago
- A cross platform implementation of Text-to-Speech based on ONNXRuntime.☆32Updated 2 years ago
- OpenSpeaker is a completely independent and open source speaker recognition project. It provides the entire process of speaker recognitio…☆64Updated 3 years ago
- convert spleeter pretrained model to pytorch and onnx, then convert to mnn☆20Updated 4 years ago
- Clone a voice in 5 seconds to generate arbitrary speech in real-time☆34Updated 5 years ago
- Ultra-fast Mandarin Chinese TTS☆120Updated 4 years ago
- ☆32Updated 4 years ago
- End-to-end voice wake-up toolbox, from model training to model inference.☆124Updated this week
- A library for adding punctuation into a text from ASR.☆19Updated 2 years ago
- ☆59Updated last year
- Audio tools for Python☆15Updated 8 months ago
- Optimized Syncnet and Chinese enhanced version, EN and CN checkpoints released☆12Updated 3 years ago
- Chinese real-time voice cloning☆38Updated 5 years ago
- PaddleSpeech TTS cpp☆41Updated 2 years ago
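- A Flask web-based demo system for Chinese automatic speech recognition, covering speech recognition, speech synthesis, and speaker recognition (a form of voiceprint recognition).☆172Updated last year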
- (Deprecated) WaveNet vocoder☆21Updated 5 years ago
- ncnn HiFi-GAN☆26Updated 10 months ago
- A modified version of vid2vid for Speech2Video, Text2Video Paper☆35Updated 2 years ago