Code for ICASSP 2024 paper WhisperSeg: Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection
☆41Jul 25, 2025Updated 9 months ago
Alternatives and similar repositories for WhisperSeg
Users that are interested in WhisperSeg are comparing it to the libraries listed below.
- A repository for code used to produce the results of the ICASSP 2024 paper: "SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIV…☆23Nov 25, 2024Updated last year
- Tr-VAD: An Efficient Transformer based Voice Activity Detection Model☆17Aug 1, 2024Updated last year
- A core package for acoustic communication research in Python☆43Feb 24, 2026Updated 2 months ago
- Visualization and analysis tool for passive acoustic data☆20Apr 24, 2026Updated 2 weeks ago
- Simple Python algorithms for segmenting animal (songbird, mice) vocalizations into notes and syllables using Dynamic Thresholding and Con…☆27Apr 12, 2021Updated 5 years ago
- Audio Annotation Tool for ML development☆89Apr 12, 2026Updated 3 weeks ago
- Code for Interspeech 2022 paper DeID-VC: Speaker De-identification via Zero-shot Pseudo Voice Conversion☆13May 6, 2023Updated 3 years ago
- BioAcoustic Collection Pipeline☆61Apr 30, 2026Updated last week
- This repository gathers the list of online publicly available bioacoustics datasets that can be used together with deep learning.☆41Jan 28, 2026Updated 3 months ago
- PyTorch implementation of "spectro-temporal attention-based voice activity detection"☆13Jun 4, 2024Updated last year
- This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsuperv…☆154Jun 5, 2025Updated 11 months ago
- Fork of Liu Feng's CoverHunter to run on a single computer, plus more features and documentation.☆21Apr 20, 2026Updated 2 weeks ago
- acoss: Audio Cover Song Suite is a framework for feature extraction and benchmarking for the cover song identification (CSI) task☆39Jul 6, 2023Updated 2 years ago
- C++ xtensor bindings to popular machine learning frameworks (TensorFlow & PyTorch)☆14Apr 8, 2022Updated 4 years ago
- Denoising methods for animal vocalizations☆25Dec 3, 2025Updated 5 months ago
- Library for building reproducible data pipelines to support experimentation☆20Dec 16, 2015Updated 10 years ago
- ChatTube: A Retrieval QA System for YouTube Videos☆10Jun 6, 2023Updated 2 years ago
- A scalable solution that simplifies the integration of ComfyUI for developers☆11Jul 15, 2024Updated last year
- Thesia is a Multi-track Spectrogram / Waveform viewer☆18Feb 22, 2026Updated 2 months ago
- A unified framework for Low-resource Audio Processing and Evaluation (SSL Pre-training and Downstream Fine-tuning)☆29Jul 9, 2024Updated last year
- Hybrid convolutional-recurrent neural networks for segmentation of birdsong and classification of elements☆56Feb 10, 2023Updated 3 years ago
- Offline Speaker Diarization with SenseVoice by Sherpa ONNX.☆15Dec 23, 2024Updated last year
- ☆13May 23, 2024Updated last year
- A Differentiable Acoustic Guitar Model for String-Specific Polyphonic Synthesis☆18Nov 16, 2023Updated 2 years ago
- Code and dataset for Polyglot Prompting: Multilingual Multitask Prompt Training.☆18Dec 7, 2022Updated 3 years ago
- State-of-the-art models for diacritics restoration for the Arabic language☆17Feb 23, 2025Updated last year
- Octopus is a neural machine generation toolkit for Arabic Natural Language Generation (NLG)☆10Apr 29, 2024Updated 2 years ago
- Reproducible experimental protocols for multimedia (audio, video, text) database☆118Mar 1, 2026Updated 2 months ago
- Speaker Verification using PyTorch☆13May 23, 2024Updated last year
- Transcribe desktop audio/computer audio in real-time and locally (Streaming ASR), using TorchAudio and Emformer-RNNT model for inference,…☆14May 7, 2024Updated 2 years ago
- A neural network framework for researchers studying acoustic communication☆91Mar 13, 2026Updated last month
- Vietnamese Punctuation Prediction using Pretrained Language Models☆14May 8, 2022Updated 4 years ago
- A Dataset for Cover Song Identification and Understanding☆65Feb 23, 2023Updated 3 years ago
- One command to start a streaming ASR server.☆12Oct 2, 2024Updated last year
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆94Oct 18, 2023Updated 2 years ago
- ☆45Dec 15, 2022Updated 3 years ago
- Fast Punctuation Restoration using Transformer Models for Vietnamese☆11Jun 10, 2022Updated 3 years ago
- Real Time Chat Application☆14Dec 20, 2022Updated 3 years ago
- Fine-tunes the GPT-Neo model for the Thai language.☆12Jun 30, 2021Updated 4 years ago