zilliz-bootcamp / audio_search

This project uses PANNs for audio tagging and sound event detection to obtain audio embeddings. Milvus is then used to search for similar audio items.
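The embed-then-search idea can be illustrated with a minimal sketch. It is not the project's code: the file names and 4-dimensional vectors are made-up toy values standing in for PANNs embeddings, and a brute-force cosine-similarity scan stands in for the Milvus index. The helpers `cosine_similarity` and `search_similar` are hypothetical names introduced only for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def search_similar(query, index, top_k=2):
    """Return the top_k (name, score) pairs most similar to the query.

    A brute-force scan over every stored vector; a vector database
    such as Milvus would replace this with an indexed search.
    """
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy embeddings (assumed values, not real PANNs output).
index = {
    "dog_bark.wav": [0.9, 0.1, 0.0, 0.2],
    "siren.wav": [0.0, 0.8, 0.6, 0.1],
    "dog_growl.wav": [0.8, 0.2, 0.1, 0.3],
}
query = [0.85, 0.15, 0.05, 0.25]

results = search_similar(query, index, top_k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

In a real deployment the embeddings would come from an audio model and have far more dimensions, and the scan would be replaced by an approximate nearest-neighbor index so that search stays fast as the collection grows.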

☆26

Alternatives and similar repositories for audio_search

Users who are interested in audio_search are comparing it to the libraries listed below.

Tzenthin / wenet_mnn
Speech recognition model converted from PyTorch to ONNX to MNN, with deployment implemented in C++
☆71 Updated 2 years ago
lovemefan / paraformer.cpp
Port of FunASR's Paraformer model in C/C++
☆32 Updated last year
iwater / Real-Time-Voice-Cloning-Chinese
Clone a voice in 5 seconds to generate arbitrary speech in real-time
☆34 Updated 5 years ago
lovemefan / fsmn-vad
An enterprise-grade Voice Activity Detector from ModelScope and FunASR.
☆105 Updated 2 years ago
lovemefan / paraformer-python
Paraformer (Chinese ASR) online ONNX Runtime for Python
☆48 Updated last year
RapidAI / RapidTTS
A cross platform implementation of Text-to-Speech based on ONNXRuntime.
☆32 Updated 2 years ago
apinge / MeloTTS.cpp
A lightweight pure C++ Text-to-Speech (TTS) pipeline with OpenVINO, supporting multiple languages.
☆69 Updated 2 weeks ago
xiaominfc / melspectrogram_cpp
C/C++ implementation of the melspectrogram computation from the Python audio processing library librosa
☆31 Updated 3 years ago
zycv / OpenSpeaker
OpenSpeaker is a completely independent and open source speaker recognition project. It provides the entire process of speaker recognitio…
☆64 Updated 3 years ago
Sg4Dylan / libvits-ncnn
libvits-ncnn is an ncnn implementation of the VITS library that enables cross-platform GPU-accelerated speech synthesis.🎙️💻
☆62 Updated 2 years ago
chenyangMl / keyword-spot
End-to-end voice wake-up (keyword spotting) toolbox, from model training to model inference.
☆121 Updated 10 months ago
lucasjinreal / aural
A Tiny Project For ASR model training and Deployment
☆27 Updated 2 years ago
duj12 / ASR-2Pass
ASR 2Pass onnxruntime and websocket server, based on FunASR(https://github.com/alibaba-damo-academy/FunASR).
☆74 Updated 3 months ago
QDPeng / Kaldi-NDK-Feature
☆9 Updated 5 years ago
JasonWei512 / wavenet_vocoder
(Deprecated) WaveNet vocoder
☆21 Updated 5 years ago
HaujetZhao / Chinese-ITN
Chinese inverse text normalization (Chinese ITN), i.e., converting Chinese numerals in text to Arabic numerals.
☆15 Updated last year
huismiling / wenet_trt8
☆75 Updated 3 years ago
yeyupiaoling / YeAudio
Audio tools for Python
☆15 Updated 8 months ago
magicse / ncnn-hifi-GAN
ncnn HiFi-GAN
☆26 Updated 9 months ago
IMLHF / Real-Time-Voice-Cloning
(pytorch) multi speaker TTS,
☆68 Updated 5 years ago
julianyulu / SyncNetCN
Optimized Syncnet and Chinese enhanced version, EN and CN checkpoints released
☆12 Updated 3 years ago
xingmegshuo / zhrtvc
Chinese real-time voice cloning
☆38 Updated 5 years ago
pengzhendong / pysilero
Python Wrapper of Silero VAD
☆57 Updated 2 months ago
xiayongtao / aidatatang_1505zh
☆30 Updated 6 years ago
yvonwin / qwen2.cpp
qwen2 and llama3 cpp implementation
☆45 Updated last year
leonardltk / Shazam-An-Industrial-Strength-Audio-Search-Algorithm-
Detects which song in the database an audio segment belongs to, and returns Nil if the song does not exist in the database.
☆21 Updated 4 years ago
EdVince / whisper-trtllm
Whisper in TensorRT-LLM
☆16 Updated last year
TeaPoly / speexdsp-ns-python
Python bindings of speexdsp noise suppression library
☆39 Updated 2 years ago
RapidAI / RapidPunc
A library for adding punctuation into a text from ASR.
☆18 Updated 2 years ago
lovemefan / SenseVoice-python
SenseVoice-python: An enterprise-grade open-source multilingual ASR system from FunASR, running on ONNX Runtime
☆96 Updated 9 months ago