earthspecies / ispa
☆19Updated last week
Alternatives and similar repositories for ispa:
Users who are interested in ispa are comparing it to the repositories listed below:
- ☆40Updated 5 months ago
- Stable Audio Unofficial Implementation: Latent Diffusion for Audio Generation☆23Updated 9 months ago
- Repository for fine-tuning BEATs and using BEATs as feature extractor in a prototypical network. This repository has been used to complet…☆32Updated 7 months ago
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986☆37Updated last month
- Speech Human Evaluation Estimation Toolkit (SHEET)☆41Updated last week
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities☆80Updated 3 months ago
- Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…☆71Updated last year
- EVAR ~ Evaluation package for Audio Representations☆43Updated 2 weeks ago
- ☆46Updated this week
- Collection of scripts from mHuBERT-147.☆22Updated last week
- Unofficial implementation of NANSY++ in PyTorch Lightning☆49Updated 8 months ago
- SERAB: a multi-lingual benchmark for speech emotion recognition☆28Updated last year
- ☆27Updated last year
- ☆79Updated last year
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture.☆42Updated 2 months ago
- ☆51Updated last year
- Robust Singing Voice Transcription and MIDI Extraction☆58Updated this week
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.☆67Updated last week
- AudioSR-Upsampling (any -> 48kHz)☆38Updated 9 months ago
- Reference-aware automatic speech evaluation toolkit☆109Updated 9 months ago
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning'☆88Updated 4 months ago
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model☆50Updated last year
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆10Updated 11 months ago
- Source code and demo for INTERSPEECH 2023 paper: DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion P…☆34Updated 11 months ago
- EMO-SUPERB submission☆28Updated 2 months ago
- BEANS: The Benchmark of Animal Sounds☆80Updated last month
- ☆50Updated 9 months ago
- Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"☆84Updated 2 months ago
- ☆27Updated last year