Code for ICASSP 2024 paper WhisperSeg: Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection
☆41Jul 25, 2025Updated 8 months ago
Alternatives and similar repositories for WhisperSeg
Users that are interested in WhisperSeg are comparing it to the libraries listed below.
- A repository for code used to produce the results of the ICASSP 2024 paper: "SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIV…☆21Nov 25, 2024Updated last year
- Tr-VAD: An Efficient Transformer based Voice Activity Detection Model☆17Aug 1, 2024Updated last year
- A core package for acoustic communication research in Python☆43Feb 24, 2026Updated last month
- Visualization and analysis tool for passive acoustic data☆20Updated this week
- Simple python algorithms for segmenting animal (songbird, mice) vocalizations into notes and syllables using Dynamic Thresholding and Con…☆27Apr 12, 2021Updated 5 years ago
- Audio Annotation Tool for ML development☆88Apr 12, 2026Updated last week
- BioAcoustic Collection Pipeline☆57Apr 10, 2026Updated last week
- This repository gathers the list of online publicly available bioacoustics datasets that can be used together with deep learning.☆40Jan 28, 2026Updated 2 months ago
- Implementation of the paper "Attentive Statistics Pooling for Deep Speaker Embedding" in Pytorch☆49Jun 4, 2020Updated 5 years ago
- This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsuperv…☆152Jun 5, 2025Updated 10 months ago
- 3D Sound Source Localization using Masked Autoencoders☆19Feb 12, 2025Updated last year
- This is a demo project showing how to fine-tune and deploy the Whisper model on SageMaker.☆25Dec 20, 2023Updated 2 years ago
- Fork of Liu Feng's CoverHunter to run on a single computer, plus more features and documentation.☆21Mar 27, 2026Updated 3 weeks ago
- acoss: Audio Cover Song Suite is a framework for feature extraction and benchmarking for the cover song identification (CSI) task☆39Jul 6, 2023Updated 2 years ago
- C++ xtensor bindings to popular machine learning frameworks (TensorFlow & PyTorch)☆14Apr 8, 2022Updated 4 years ago
- denoising methods used in animal vocalization denoising☆25Dec 3, 2025Updated 4 months ago
- animal2vec: A self-supervised transformer for rare-event raw audio input☆31Dec 15, 2025Updated 4 months ago
- A Differentiable Acoustic Guitar Model for String-Specific Polyphonic Synthesis☆17Nov 16, 2023Updated 2 years ago
- Pre-trained models for bioacoustic classification tasks☆63Updated this week
- Thesia is a Multi-track Spectrogram / Waveform viewer☆18Feb 22, 2026Updated last month
- Python Passive Acoustic Analysis tool for Passive Acoustic Monitoring (PAM)☆50Mar 30, 2026Updated 2 weeks ago
- A unified framework for Low-resource Audio Processing and Evaluation (SSL Pre-training and Downstream Fine-tuning)☆29Jul 9, 2024Updated last year
- Hybrid convolutional-recurrent neural networks for segmentation of birdsong and classification of elements☆56Feb 10, 2023Updated 3 years ago
- state-of-the-art models for diacritics restoration for Arabic language☆17Feb 23, 2025Updated last year
- Collection of notebooks exploring conv nets in detail.☆10Sep 14, 2017Updated 8 years ago
- Reproducible experimental protocols for multimedia (audio, video, text) database☆116Mar 1, 2026Updated last month
- Octopus is a neural machine generation toolkit for Arabic Natural Language Generation (NLG)☆10Apr 29, 2024Updated last year
- Speaker Verification using Pytorch☆13May 23, 2024Updated last year
- A neural network framework for researchers studying acoustic communication☆91Mar 13, 2026Updated last month
- Final training script from the HuggingFace Whisper fine-tuning event, for getting the best results on a fine-tuned model.☆12Dec 24, 2022Updated 3 years ago
- Vietnamese Punctuation Prediction using Pretrained Language Models☆14May 8, 2022Updated 3 years ago
- One command to start a streaming ASR server.☆12Oct 2, 2024Updated last year
- A Dataset for Cover Song Identification and Understanding☆65Feb 23, 2023Updated 3 years ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆94Oct 18, 2023Updated 2 years ago
- ☆45Dec 15, 2022Updated 3 years ago
- Fast Punctuation Restoration using Transformer Models for Vietnamese☆11Jun 10, 2022Updated 3 years ago
- Fine-tunes the GPT-Neo model for the Thai language.☆12Jun 30, 2021Updated 4 years ago
- Self-Supervised Speech/Sound Pre-training and Representation Learning Toolkit☆13Nov 18, 2022Updated 3 years ago
- Thai-English transliteration dictionary☆17Jun 24, 2022Updated 3 years ago