nianlonggu/WhisperSeg

Code for ICASSP 2024 paper WhisperSeg: Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection

☆42

Alternatives and similar repositories for WhisperSeg

Users who are interested in WhisperSeg are comparing it to the repositories listed below.

DBD-research-group / BioFoundation
☆16 · May 7, 2026 · Updated 2 months ago
timsainb / birdbrain
A library for viewing songbird brain atlases (European starling, Canary, Zebra finch, Pigeon, Mustached bat)
☆23 · Sep 10, 2019 · Updated 6 years ago
smcgill3 / zeromq-matlab
ZeroMQ mex bindings for MATLAB
☆21 · Feb 2, 2015 · Updated 11 years ago
timsainb / vocalization-segmentation
Simple python algorithms for segmenting animal (songbird, mice) vocalizations into notes and syllables using Dynamic Thresholding and Con…
☆27 · Apr 12, 2021 · Updated 5 years ago
a43992899 / DeID-VC
Code for Interspeech2022 paper DeID-VC: Speaker De-identification via Zero-shot Pseudo Voice Conversion
☆13 · May 6, 2023 · Updated 3 years ago
janclemenslab / das
Deep Audio Segmenter
☆34 · Mar 15, 2026 · Updated 4 months ago
KrishnaDN / Attentive-Statistics-Pooling-for-Deep-Speaker-Embedding
Implementation of the paper "Attentive Statistics Pooling for Deep Speaker Embedding" in Pytorch
☆49 · Jun 4, 2020 · Updated 6 years ago
Yifei-ZHAO96 / STAM-pytorch
Pytorch implementation of "spectro-temporal attention-based voice activity detection"
☆13 · Jun 4, 2024 · Updated 2 years ago
mbsantiago / whombat
Audio Annotation Tool for ML development
☆91 · Jul 8, 2026 · Updated 3 weeks ago
aws-samples / amazon-sagemaker-finetune-deploy-whisper-huggingface
This is a demo project showing how to fine-tune and deploy the Whisper model on SageMaker.
☆26 · Dec 20, 2023 · Updated 2 years ago
xaviermouy / SoundScope
Visualization and analysis tool for passive acoustic data
☆21 · Jun 28, 2026 · Updated last month
furkanyesiler / acoss
acoss: Audio Cover Song Suite is a framework for feature extraction and benchmarking for the cover song identification (CSI) task
☆40 · Jul 6, 2023 · Updated 3 years ago
SELMA-project / ml4audio
audio, NLP, ML with huggingface, nvidia/nemo, speechbrain
☆11 · Sep 4, 2023 · Updated 2 years ago
bioacoustic-ai / bacpipe
BioAcoustic Collection Pipeline
☆67 · Updated this week
bioacoustic-ai / bioacoustics-datasets
This repository gathers the list of online publicly available bioacoustics datasets that can be used together with deep learning.
☆44 · May 26, 2026 · Updated 2 months ago
wolfv / xtensor-ml
C++ xtensor bindings to popular machine learning frameworks (TensorFlow & PyTorch)
☆14 · Apr 8, 2022 · Updated 4 years ago
lifewatch / pypam
Python Passive Acoustic Analysis tool for Passive Acoustic Monitoring (PAM)
☆53 · Jul 10, 2026 · Updated 2 weeks ago
Sreyan88 / LAPE
A unified framework for Low-resource Audio Processing and Evaluation (SSL Pre-training and Downstream Fine-tuning)
☆29 · Jul 9, 2024 · Updated 2 years ago
GTSuya-Studio / ComfyUI-Gtsuya-Nodes
A set of custom nodes for ComfyUI - focused on wildcards utilities
☆14 · Mar 29, 2026 · Updated 4 months ago
OmkarAcharekar / ChatValve
Real Time Chat Application
☆14 · Dec 20, 2022 · Updated 3 years ago
wxqwinner / silero-vad-ncnn
Silero VAD (ncnn): pre-trained enterprise-grade Voice Activity Detector.
☆26 · Aug 21, 2024 · Updated last year
vocalpy / vak
A neural network framework for researchers studying acoustic communication
☆92 · Mar 13, 2026 · Updated 4 months ago
edenartlab / flux-trainer
Eden Flux LoRA trainer and full-finetuning
☆23 · Mar 21, 2025 · Updated last year
Jordain / Comfy_Image_Workshop
A scalable solution that simplifies the integration of ComfyUI for developers
☆11 · Jul 15, 2024 · Updated 2 years ago
andywiggins / ddsp-guitar-synth
A Differentiable Acoustic Guitar Model for String-Specific Polyphonic Synthesis
☆18 · Nov 16, 2023 · Updated 2 years ago
ljvillanueva / pumilio
Pumilio: A Web-Based Management System for Ecological Recordings
☆13 · Oct 29, 2018 · Updated 7 years ago
pyannote / pyannote-database
Reproducible experimental protocols for multimedia (audio, video, text) database
☆120 · Mar 1, 2026 · Updated 4 months ago
jinlanfu / Polyglot_Prompt
Code and dataset for Polyglot Prompting: Multilingual Multitask Prompt Training.
☆18 · Dec 7, 2022 · Updated 3 years ago
aranciokov / FSMMDA_VideoRetrieval
☆10 · Nov 23, 2023 · Updated 2 years ago
Intersection98 / ComfyUI_MX_post_processing-nodes
☆13 · May 23, 2024 · Updated 2 years ago
axeber01 / wav2pos
3D Sound Source Localization using Masked Autoencoders
☆21 · Feb 12, 2025 · Updated last year
livingingroups / animal2vec
animal2vec: A self-supervised transformer for rare-event raw audio input
☆32 · Dec 15, 2025 · Updated 7 months ago
AlessioMichelassi / openPyVision_013
Welcome to my project. OpenPyVision is a real time videoMixer based on opencv and pyqt6.
☆14 · Aug 22, 2024 · Updated last year
MTG / da-tacos
A Dataset for Cover Song Identification and Understanding
☆66 · Feb 23, 2023 · Updated 3 years ago
FrenchKrab / IS2023-powerset-diarization
Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.
☆96 · Oct 18, 2023 · Updated 2 years ago
skakouros / s3prl_attentive_correlation
Self-Supervised Speech/Sound Pre-training and Representation Learning Toolkit
☆13 · Nov 18, 2022 · Updated 3 years ago
krylm / whisper-event-tuning
Final training script from the HuggingFace Whisper fine-tuning event, for getting the best results from a fine-tuned model.
☆12 · Dec 24, 2022 · Updated 3 years ago
Open-Speech-EkStep / indic-punct
☆45 · Dec 15, 2022 · Updated 3 years ago
wannaphong / thaigpt-next
Fine-tunes the GPT-Neo model for the Thai language.
☆12 · Jun 30, 2021 · Updated 5 years ago