umbertocappellazzo/PETL_AST

This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" [IEEE MLSP 2024] and "Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters" [Interspeech 2024].

☆41

Alternatives and similar repositories for PETL_AST

Users that are interested in PETL_AST are comparing it to the libraries listed below.

fyvo / WMT-Biomed-Test
☆13 · Aug 23, 2024 · Updated last year
965694547 / Hybrid-system-of-frame-wise-model-and-SEDT
☆28 · Mar 14, 2023 · Updated 3 years ago
JinhuaLiang / LaD-ProtoNet
☆16 · Sep 14, 2023 · Updated 2 years ago
sungnyun / avsr-temporal-dynamics
(SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
☆13 · Oct 22, 2024 · Updated last year
ta012 / DTFAT
[AAAI 2024] DTF-AT: Decoupled Time-Frequency Audio Transformer for Event Classification
☆12 · Mar 10, 2025 · Updated last year
Sara-Ahmed / ASiT
ASiT: Audio Spectrogram vIsion Transformer for General Audio Representation
☆30 · Mar 10, 2024 · Updated 2 years ago
sungnyun / cav2vec
(ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
☆16 · Apr 29, 2025 · Updated last year
sungnyun / ARMHuBERT
(Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT
☆41 · Aug 29, 2024 · Updated last year
sinhat98 / adapter-wavlm
☆46 · Feb 16, 2023 · Updated 3 years ago
Advanced-Vision-and-Learning-Lab / HLTDNN
Histogram Layer Time Delay Neural Networks For Passive Sonar Classification
☆19 · Jan 21, 2026 · Updated 6 months ago
kaen2891 / stethoscope-guided_supervised_contrastive_learning
(ICASSP 2024) Official Implementation of "Stethoscope-guided Supervised Contrastive Learning for Cross-domain Adaptation on Respiratory So…
☆18 · Dec 5, 2024 · Updated last year
SolomidHero / speech-regeneration-enhancer
PyTorch implementation of the paper "High Fidelity Speech Regeneration With Application to Speech Enhancement"
☆15 · May 8, 2021 · Updated 5 years ago
AlanBaade / MAE-AST-Public
Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer
☆93 · Jun 9, 2022 · Updated 4 years ago
pengzhendong / pyvisqol
Python Wrapper of visqol
☆11 · Dec 23, 2024 · Updated last year
raymin0223 / patch-mix_contrastive_learning
Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification (INTERSPEECH 2023)
☆76 · Mar 11, 2025 · Updated last year
zhengmidon / singaligner
A compact audio-to-phoneme aligner for singing voice
☆12 · Jan 17, 2024 · Updated 2 years ago
Audio-WestlakeU / audiossl
A library built for easier audio self-supervised training and downstream task evaluation
☆140 · Sep 25, 2025 · Updated 10 months ago
SiavashShams / ssamba
[SLT'24] The official implementation of SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
☆140 · Nov 5, 2025 · Updated 8 months ago
pengzhendong / streaming-asr
One command to start a streaming ASR server.
☆12 · Oct 2, 2024 · Updated last year
SarthakYadav / audio-mamba-official
Official implementation for our paper "Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations"
☆44 · Aug 14, 2025 · Updated 11 months ago
frednam93 / FDY-SED
☆96 · Jun 22, 2023 · Updated 3 years ago
Sreyan88 / LAPE
A unified framework for Low-resource Audio Processing and Evaluation (SSL Pre-training and Downstream Fine-tuning)
☆29 · Jul 9, 2024 · Updated 2 years ago
Exgc / AVMuST-TED
☆24 · Mar 30, 2024 · Updated 2 years ago
midas-research / speechmix
☆12 · Oct 2, 2020 · Updated 5 years ago
NikolaiKyhne / RWSAMamba-UNet
Official repository for the paper "Exploring Resolution-Wise Shared Attention in Hybrid Mamba-U-Nets for Improved Cross-Corpus Speech Enh…
☆19 · May 5, 2026 · Updated 2 months ago
JeongHun0716 / MMS-LLaMA
Official PyTorch implementation for "MMS-LLaMA: Efficient LLM-based Audio-Visual Speech Recognition with Minimal Multimodal Speech Tokens…
☆48 · Jun 12, 2025 · Updated last year
AlbertoAncilotto / NeSsi
Keras/PyTorch neural network size, operations and parameters counter
☆16 · Mar 23, 2023 · Updated 3 years ago
nttcslab / eval-audio-repr
EVAR ~ Evaluation package for Audio Representations
☆81 · Feb 19, 2026 · Updated 5 months ago
roger-tseng / CodecFake
A deepfake audio dataset for detecting fake speech from codec-based speech synthesis systems, Interspeech 2024
☆22 · Jul 27, 2024 · Updated last year
umbertocappellazzo / Omni-AVSR
Official PyTorch implementation of "Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models" [IEEE ICASSP 202…
☆38 · Mar 10, 2026 · Updated 4 months ago
aixplain / NoRefER
☆18 · Jun 5, 2026 · Updated last month
JuanFMontesinos / Acappella-YNet
Official implementation of A cappella: Audio-visual Singing Voice Separation, from BMVC21
☆18 · May 14, 2022 · Updated 4 years ago
glory20h / FitHuBERT
FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning (INTERSPEECH 2022)
☆19 · Nov 15, 2023 · Updated 2 years ago
j-bernardi / psds_eval
Polyphonic Sound Detection Score (PSDS)
☆20 · Jan 20, 2020 · Updated 6 years ago
yqx7150 / HKGM
One-shot Generative Prior in Hankel-k-space for Parallel Imaging Reconstruction
☆12 · Dec 4, 2024 · Updated last year
ga642381 / Speech-Prompts-Adapters
This repository surveys papers focusing on prompting and adapters for speech processing.
☆113 · Aug 4, 2023 · Updated 2 years ago
gaowanlu / electronic-design-competition
NUEDC 2021 G by OpenMV4
☆14 · Nov 19, 2021 · Updated 4 years ago
JeongHun0716 / zero-avsr
Official PyTorch implementation for "Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech …
☆37 · May 11, 2025 · Updated last year
yangdongchao / Tim-TSENet
The source code of Tim-TSENet
☆15 · Apr 22, 2022 · Updated 4 years ago