earthspecies / ispa
☆24Updated 6 months ago
Alternatives and similar repositories for ispa
Users who are interested in ispa are comparing it to the repositories listed below
- AVES: Animal Vocalization Encoder based on Self-Supervision☆115Updated last month
- BEANS: The Benchmark of Animal Sounds☆101Updated 6 months ago
- Audiogen Codec☆135Updated 10 months ago
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning'☆96Updated 9 months ago
- A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline☆131Updated 5 months ago
- ☆85Updated 2 weeks ago
- Repository for fine-tuning BEATs and using BEATs as feature extractor in a prototypical network. This repository has been used to complet…☆34Updated 2 months ago
- ☆43Updated 11 months ago
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture.☆49Updated last month
- LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation …☆64Updated 4 months ago
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986☆45Updated 7 months ago
- A simple library for Fréchet Audio Distance (FAD) calculation☆205Updated 2 weeks ago
- Speech Human Evaluation Estimation Toolkit (SHEET)☆68Updated 6 months ago
- ☆26Updated 5 months ago
- Unofficial implementation of JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models (https://arxiv.org/abs/2308.…☆51Updated last year
- This repository contains the code for the paper "voc2vec: A Foundation Model for Non-Verbal Vocalization", accepted at ICASSP 2025.☆29Updated last month
- ☆83Updated last year
- Baseline multi-resolution cross network model trained using the Divide and Remaster Dataset☆81Updated last year
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities☆126Updated 5 months ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.☆87Updated 4 months ago
- The official implementation of our paper "Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tu…☆83Updated 8 months ago
- Million Song Dataset split with extended clean tags and artist-level stratification☆49Updated last year
- Unofficial download repository for MusicCaps☆47Updated 2 years ago
- Open implementation of UNIVERSE and UNIVERSE++ diffusion-based speech enhancement models.☆94Updated 8 months ago
- ☆64Updated 3 weeks ago
- SelfRemaster: SSL Speech Restoration☆88Updated last year
- ☆16Updated last week
- A collection of useful audio datasets and transforms for PyTorch.☆139Updated 2 years ago
- SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024)☆71Updated 3 months ago
- Expressive Anechoic Recordings of Speech (EARS)☆165Updated 10 months ago