Labbeti / aac-datasets
Audio Captioning datasets for PyTorch.
☆113Updated 3 months ago
Alternatives and similar repositories for aac-datasets:
Users who are interested in aac-datasets are comparing it to the libraries listed below:
- A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline☆116Updated 2 months ago
- Versatile Evaluation of Speech and Audio☆156Updated this week
- This package aims at simplifying the download of the AudioCaps dataset.☆31Updated last year
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization☆168Updated 7 months ago
- Reference-aware automatic speech evaluation toolkit☆142Updated 2 months ago
- This repository contains metadata of the WavCaps dataset and code for downstream tasks.☆218Updated 6 months ago
- This repository surveys papers focusing on Prompting and Adapters for Speech Processing.☆107Updated last year
- 《SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks》Speech processing with prompting paradigm☆81Updated last year
- Dataset and baseline code for the VocalSound dataset (ICASSP2022).☆128Updated 2 years ago
- Audio Codec Speech processing Universal PERformance Benchmark☆238Updated 3 months ago
- [IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer☆130Updated last month
- Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"☆51Updated last week
- Code for CVSSP submission to DCASE 2021 Task 6☆35Updated 2 years ago
- A library built for easier audio self-supervised training and downstream task evaluation☆111Updated 5 months ago
- A list of papers about audio captioning☆78Updated 2 years ago
- Web-crawl for "Audio Retrieval with WavText5K and CLAP Training"☆49Updated 2 years ago
- AudioBench: A Universal Benchmark for Audio Large Language Models☆124Updated 3 weeks ago
- Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch.☆111Updated last year
- This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".☆120Updated 4 months ago
- This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent spee…☆75Updated 8 months ago
- The open source code for LLM-Codec☆126Updated 5 months ago
- An official implementation of "UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data"☆133Updated last year
- UTokyo-SaruLab MOS Prediction System☆144Updated 2 months ago
- [AAAI 2024] Code for CTX-vec2wav in UniCATS☆127Updated 8 months ago
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…☆98Updated last year
- Implementation of the paper "Self-supervised Learning with Random-projection Quantizer for Speech Recognition" in Pytorch.☆70Updated last year
- A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models (ICASSP 2024)☆51Updated 9 months ago
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities☆107Updated 2 months ago
- Official Implementation of EnCLAP (ICASSP 2024)☆90Updated 8 months ago
- AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model☆148Updated last month