LudovicTuncay / Audio-JEPA
Audio-JEPA is an adaptation of the Joint-Embedding Predictive Architecture (JEPA) for self-supervised audio representation learning. Built upon the I-JEPA paradigm, it uses a Vision Transformer (ViT) backbone to predict latent representations of masked spectrogram patches.
42 · Mar 19, 2026 · Updated this week
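
The description mentions predicting latent representations of masked spectrogram patches. As a rough illustration of the first step only, here is a minimal sketch of cutting a spectrogram into non-overlapping patches and splitting them into visible (context) and masked (target) sets. It is not taken from the Audio-JEPA repository: the function names `patchify` and `split_context_target`, the patch size of 16, the 50% mask ratio, and the uniform random masking are all assumptions chosen for illustration, and the encoder and predictor networks are left out entirely.

```python
import random


def patchify(spec, patch=16):
    """Split a 2-D spectrogram (list of rows) into flattened, non-overlapping patches.

    Hypothetical helper: the patch size of 16 is an assumption, not a value
    taken from the Audio-JEPA repository. Ragged edges are dropped.
    """
    n_rows, n_cols = len(spec), len(spec[0])
    grid_h, grid_w = n_rows // patch, n_cols // patch
    patches = []
    for i in range(grid_h):          # patch row index (frequency axis)
        for j in range(grid_w):      # patch column index (time axis)
            block = [
                spec[i * patch + a][j * patch + b]
                for a in range(patch)
                for b in range(patch)
            ]
            patches.append(block)
    return patches


def split_context_target(n_patches, mask_ratio=0.5, seed=0):
    """Randomly split patch indices into context (visible) and target (masked).

    Uniform random masking is an assumption made for simplicity; the actual
    project may use a different masking strategy.
    """
    rng = random.Random(seed)
    order = list(range(n_patches))
    rng.shuffle(order)
    n_target = round(n_patches * mask_ratio)
    target = sorted(order[:n_target])
    context = sorted(order[n_target:])
    return context, target


# Toy "spectrogram": 128 frequency bins x 256 time frames of random values.
_rng = random.Random(0)
spec = [[_rng.gauss(0.0, 1.0) for _ in range(256)] for _ in range(128)]

patches = patchify(spec)
context_idx, target_idx = split_context_target(len(patches))
```

In a JEPA-style setup, a context encoder would see only the patches at `context_idx`, and a predictor would be trained to match the embeddings a separate target encoder produces for the patches at `target_idx`, so the loss lives in latent space rather than on the raw spectrogram values.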

Alternatives and similar repositories for Audio-JEPA

Users that are interested in Audio-JEPA are comparing it to the libraries listed below.
