earthspecies / ispa
☆24 Updated 5 months ago
Alternatives and similar repositories for ispa:
Users who are interested in ispa are comparing it to the libraries listed below.
- This repository contains the code for the paper "voc2vec: A Foundation Model for Non-Verbal Vocalization", accepted at ICASSP 2025. ☆28 Updated last week
- AVES: Animal Vocalization Encoder based on Self-Supervision ☆107 Updated last week
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning' ☆95 Updated 9 months ago
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986 ☆46 Updated 6 months ago
- BEANS: The Benchmark of Animal Sounds ☆97 Updated 6 months ago
- Official implementation of MelHuBERT ☆65 Updated 5 months ago
- Collection of scripts from mHuBERT-147. ☆24 Updated 5 months ago
- Layer-wise analysis of self-supervised pre-trained speech representations ☆101 Updated 6 months ago
- ARCH: Audio Representations benCHmark ☆44 Updated 7 months ago
- Repository for fine-tuning BEATs and using BEATs as feature extractor in a prototypical network. This repository has been used to complet… ☆34 Updated last month
- SA-toolkit: Speaker speech anonymization toolkit in python ☆23 Updated last month
- 《SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks》Speech processing with prompting paradigm ☆81 Updated last year
- SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024) ☆70 Updated 3 months ago
- This is the M-AILABS Speech Dataset ☆57 Updated 4 months ago
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities ☆123 Updated 4 months ago
- A 6-million Audio-Caption Paired Dataset Built with a LLMs and ALMs-based Automatic Pipeline ☆126 Updated 4 months ago
- Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models ☆51 Updated last month
- ☆43 Updated 10 months ago
- High fidelity, lightweight, end-to-end, streaming, convolution-based neural audio codec ☆98 Updated 3 months ago
- ☆59 Updated last week
- Word Discovery in Visually Grounded, Self-Supervised Speech Models ☆26 Updated last year
- Sylber: Syllabic Embedding Representation of Speech from Raw Audio ☆54 Updated last month
- [ICASSP 2025] Official Pytorch implementation of "Large Language Models are Strong Audio-Visual Speech Recognition Learners". ☆17 Updated last month
- ☆59 Updated 5 months ago
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture. ☆49 Updated last month
- High-Fidelity Neural Phonetic Posteriorgrams ☆109 Updated 2 months ago
- Survey on speech generation work. ☆17 Updated last year
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆86 Updated 4 months ago
- ☆28 Updated last year
- DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning ☆49 Updated last year