cwx-worst-one / EAT
[IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer
☆175 Updated last month
Alternatives and similar repositories for EAT
Users who are interested in EAT are comparing it to the repositories listed below.
- Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification" ☆70 Updated 3 months ago
- A 6-million Audio-Caption Paired Dataset Built with a LLMs and ALMs-based Automatic Pipeline ☆175 Updated 7 months ago
- Audio Captioning datasets for PyTorch. ☆120 Updated 3 weeks ago
- This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection". ☆137 Updated last week
- This package aims at simplifying the download of the AudioCaps dataset. ☆36 Updated last year
- ☆85 Updated 2 months ago
- Unofficial implementation of the High Fidelity Neural Audio Compression ☆160 Updated 11 months ago
- Dataset and baseline code for the VocalSound dataset (ICASSP2022). ☆147 Updated 2 years ago
- 🦇 Encoder of BAT (Learning to Reason about Spatial Sounds with Large Language Models) ☆57 Updated 5 months ago
- Audio-FLAN ☆157 Updated 5 months ago
- Source code for Consistent ensemble distillation for audio tagging ☆43 Updated last month
- Masked Modeling Duo: Towards a Universal Audio Pre-training Framework ☆109 Updated last year
- This is an evolving repo for the paper "Towards Controllable Speech Synthesis in the Era of Large Language Models: A Survey".