Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
☆1,458 · May 21, 2023 · Updated 3 years ago
Alternatives and similar repositories for ast
Users that are interested in ast are comparing it to the libraries listed below.
- Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer". ☆423 · Aug 14, 2022 · Updated 3 years ago
- Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation". ☆150 · Jul 13, 2023 · Updated 2 years ago
- The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection" ☆495 · Sep 18, 2025 · Updated 8 months ago
- Efficient Training of Audio Transformers with Patchout ☆382 · Jan 12, 2024 · Updated 2 years ago
- ☆1,739 · Jul 25, 2024 · Updated last year
- This repo hosts the code and models of "Masked Autoencoders that Listen". ☆664 · Apr 5, 2024 · Updated 2 years ago
- Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand". ☆472 · Apr 24, 2024 · Updated 2 years ago
- Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder". ☆291 · Mar 20, 2024 · Updated 2 years ago
- ESC-50: Dataset for Environmental Sound Classification ☆1,822 · Mar 20, 2024 · Updated 2 years ago
- A Python library for audio data augmentation. Useful for making audio ML models work well in the real world, not just in the lab. ☆2,275 · Apr 13, 2026 · Updated last month
- Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning. ☆1,148 · Nov 24, 2025 · Updated 6 months ago
- Self-Supervised Speech Pre-training and Representation Learning Toolkit ☆2,555 · Mar 12, 2026 · Updated 2 months ago
- Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043) ☆868 · Sep 30, 2021 · Updated 4 years ago
- LEAF is a learnable alternative to audio features such as mel-filterbanks, that can be initialized as an approximation of mel-filterbanks… ☆526 · Mar 1, 2022 · Updated 4 years ago
- Contrastive Language-Audio Pretraining ☆2,157 · May 15, 2025 · Updated last year
- This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training … ☆345 · Nov 20, 2024 · Updated last year
- Learning audio concepts from natural language supervision ☆663 · Sep 18, 2024 · Updated last year
- BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation ☆234 · Apr 26, 2023 · Updated 3 years ago
- Official PyTorch implementation of Contrastive Learning of Musical Representations ☆336 · Jul 25, 2024 · Updated last year
- VGGSound: A Large-scale Audio-Visual Dataset ☆357 · Sep 13, 2021 · Updated 4 years ago