earthspecies / ispa
☆25 · Updated 7 months ago
Alternatives and similar repositories for ispa
Users who are interested in ispa are comparing it to the libraries listed below.
- AVES: Animal Vocalization Encoder based on Self-Supervision ☆121 · Updated 2 months ago
- BEANS: The Benchmark of Animal Sounds ☆104 · Updated 8 months ago
- ☆27 · Updated last week
- Coco-Nut (Corpus of connecting NIHONGO utterance and text) corpus ☆22 · Updated last year
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning' ☆97 · Updated 11 months ago
- animal2vec: A self-supervised transformer for rare-event raw audio input ☆25 · Updated 4 months ago
- This repository contains the code for the paper "voc2vec: A Foundation Model for Non-Verbal Vocalization", accepted at ICASSP 2025. ☆34 · Updated 2 months ago
- A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline ☆156 · Updated 6 months ago
- Contrastive language-audio pretraining for bioacoustics ☆19 · Updated last year
- Repository for fine-tuning BEATs and using BEATs as feature extractor in a prototypical network. This repository has been used to complet… ☆34 · Updated 3 months ago
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986 ☆47 · Updated 8 months ago
- SERAB: a multi-lingual benchmark for speech emotion recognition ☆28 · Updated 2 years ago
- Official implementation for FlowSep ☆52 · Updated 5 months ago
- ☆98 · Updated 2 months ago
- 《SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks》 Speech processing with prompting paradigm ☆81 · Updated last year
- Layer-wise analysis of self-supervised pre-trained speech representations ☆104 · Updated 8 months ago
- Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch. ☆51 · Updated 2 months ago
- Simple diarization model ☆50 · Updated 2 weeks ago
- ☆57 · Updated 2 years ago
- A simple library for Fréchet Audio Distance (FAD) calculation ☆222 · Updated last month
- Small audio language model for reasoning ☆64 · Updated 2 months ago
- Speech Human Evaluation Estimation Toolkit (SHEET) ☆87 · Updated 3 weeks ago
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture. ☆49 · Updated 3 months ago
- ARCH: Audio Representations benCHmark ☆46 · Updated 10 months ago
- ☆44 · Updated last year
- This is the M-AILABS Speech Dataset ☆68 · Updated 7 months ago
- Audio Annotation Tool for ML development ☆67 · Updated last month
- Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification" ☆68 · Updated 2 months ago
- ☆114 · Updated 4 months ago