NINAnor/rare_species_detections

Repository for fine-tuning BEATs and using BEATs as a feature extractor in a prototypical network. This repository was used to complete the DCASE 2023 challenge on few-shot bioacoustic event detection.

☆34

Alternatives and similar repositories for rare_species_detections

Users that are interested in rare_species_detections are comparing it to the libraries listed below.

RetroCirce / HTS-Audio-Transformer
The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"
☆502 • Sep 18, 2025 • Updated 10 months ago
diggerdu / AudioMamba
☆12 • Jun 1, 2024 • Updated 2 years ago
YuanX9 / ShipsEar-An-Unofficial-Train-Test-Split
An unofficial train-test split for ShipsEar: An underwater vessel noise database
☆26 • Jul 31, 2024 • Updated last year
cristinae / ASRdys
ASR for dysarthric speakers with Kaldi
☆13 • Jan 14, 2017 • Updated 9 years ago
haoheliu / DCASE_2022_Task_5
System that ranks 2nd in DCASE 2022 Challenge Task 5: Few-shot Bioacoustic Event Detection
☆28 • Jul 6, 2022 • Updated 4 years ago
the-bird-F / GLM-Voice-RAG
[EMNLP 2025 Findings] A complete cross-modal RAG system for end-to-end speech-to-speech large models, including ASR-based Retrieval and E…
☆31 • Jul 11, 2025 • Updated last year
Audio-WestlakeU / ATST-SED
This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".
☆172 • Jun 8, 2026 • Updated last month
zhaoyanpeng / vipant
VIsually-Pivoted Audio and(N) Text
☆22 • May 16, 2022 • Updated 4 years ago
Vyvo-Labs / CodecHub
CodecHub: A Unified Library for Codec Models
☆25 • Dec 24, 2025 • Updated 6 months ago
zexupan / avse_hybrid_loss
☆16 • Jun 15, 2022 • Updated 4 years ago
ajd12342 / paraspeechclap
Codebase for 'ParaSpeechCLAP: A Dual-Encoder Speech-Text Model for Rich Stylistic Language-Audio Pretraining'
☆23 • Jun 20, 2026 • Updated last month
BirdVox / birdvoxclassify
A pre-trained deep learning system for classifying bird flight calls in audio clips
☆20 • Nov 16, 2021 • Updated 4 years ago
RicherMans / CED
Source code for Consistent ensemble distillation for audio tagging
☆75 • Mar 20, 2026 • Updated 4 months ago
deegy666 / ADD-RSC
Code repository for ‘Adaptive Differential Denoising for Respiratory Sounds Classification’
☆22 • Dec 19, 2025 • Updated 7 months ago
inverse-ai / FINALLY-Speech-Enhancement
FINALLY: Fast and universal speech enhancement model delivering studio-quality audio for a wide range of recordings.
☆28 • Apr 1, 2026 • Updated 3 months ago
bghani / xcapi
xcapi: A Python package for downloading animal sound recordings from xeno-canto API.
☆20 • May 29, 2026 • Updated last month
microsoft / SPARROW
SPARROW — Solar-Powered Acoustic and Remote Recording Observation Watch. An AI-enabled edge device for biodiversity monitoring with camer…
☆49 • Updated this week
gaborfodor / wave-bird-recognition
☆20 • Jun 2, 2021 • Updated 5 years ago
zhaoyanpeng / audioset-dl
Download AudioSet for Vision-Audio-Text Pre-training
☆13 • May 16, 2022 • Updated 4 years ago
DBD-research-group / BioFoundation
☆16 • May 7, 2026 • Updated 2 months ago
mt-upc / ZeroSwot
Pushing the Limits of Zero-shot End-to-End Speech Translation
☆25 • Dec 12, 2024 • Updated last year
AI-S2-Lab / GPT-Talker
[ACMMM'2024] Generative Expressive Conversational Speech Synthesis
☆45 • Oct 28, 2024 • Updated last year
jweihe / ADA-GAD
Official PyTorch implementation for the paper ADA-GAD: Anomaly-Denoised Autoencoders for Graph Anomaly Detection (AAAI 2024).
☆33 • May 14, 2025 • Updated last year
PoTaTo-Mika / Shore-Data-Engine
A codebase for data crawling and preprocessing for TTS and ASR systems training.
☆23 • Jun 13, 2026 • Updated last month
c4dm / dcase-few-shot-bioacoustic
☆61 • Jul 2, 2024 • Updated 2 years ago
fschmid56 / EfficientAT
This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training …
☆353 • Nov 20, 2024 • Updated last year
JudeJiwoo / nmt
☆15 • Apr 13, 2025 • Updated last year
JinhuaLiang / lam4fsl
An official repo for the paper "Adapting Language-Audio Models as Few-Shot Audio Learners"
☆31 • May 31, 2023 • Updated 3 years ago
AmphionTeam / SD-Eval
[NeurIPS 2024] SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words
☆57 • Jun 25, 2024 • Updated 2 years ago
cpdu / vallt
☆36 • Mar 14, 2025 • Updated last year
Sara-Ahmed / ASiT
ASiT: Audio Spectrogram vIsion Transformer for General Audio Representation
☆30 • Mar 10, 2024 • Updated 2 years ago
RicherMans / Datadriven-GPVAD
The codebase for Data-driven general-purpose voice activity detection.
☆93 • Aug 3, 2023 • Updated 2 years ago
jlingohr / magenta-torch
Pytorch Implementation of MusicVAE
☆16 • May 4, 2019 • Updated 7 years ago
WxxShirley / KDD2024ProCom
Codes and data for KDD 2024 Research Track paper "ProCom: A Few-shot Targeted Community Detection Algorithm"
☆11 • Aug 15, 2024 • Updated last year
bagustris / s3prl-ser
S3PRL for Speech Emotion Recognition (see s3prl > downstream)
☆15 • Feb 28, 2026 • Updated 4 months ago
ffxiong / uaspeech
Baseline kaldi script for UA-SPEECH corpus
☆32 • Oct 16, 2024 • Updated last year
yihuitang / StyleTTS_Mandarin
Implementation of StyleTTS for Mandarin
☆11 • Jun 22, 2023 • Updated 3 years ago
nii-yamagishilab / SSL-SAS
Language independent SSL-based Speaker Anonymization system
☆20 • May 28, 2024 • Updated 2 years ago
mkunes / w2v2_audioFrameClassification
wav2vec2 audio classification for prosodic boundary detection and other tasks
☆42 • Aug 11, 2023 • Updated 2 years ago