WangHelin1997/MaskSpec

The PyTorch implementation of the paper: Masked Spectrogram Prediction For Self-Supervised Audio Pre-Training

☆51

Alternatives and similar repositories for MaskSpec

Users who are interested in MaskSpec are comparing it to the libraries listed below.

nttcslab / msm-mae
Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representations
☆99 · Feb 20, 2026 · Updated 5 months ago
AlanBaade / MAE-AST-Public
Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer
☆93 · Jun 9, 2022 · Updated 4 years ago
facebookresearch / AudioMAE
This repo hosts the code and models of "Masked Autoencoders that Listen".
☆672 · Apr 5, 2024 · Updated 2 years ago
midas-research / speechmix
☆12 · Oct 2, 2020 · Updated 5 years ago
ta012 / DTFAT
[AAAI 2024] DTF-AT: Decoupled Time-Frequency Audio Transformer for Event Classification
☆12 · Mar 10, 2025 · Updated last year
YuanGongND / ssast
Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".
☆426 · Aug 14, 2022 · Updated 3 years ago
edufonseca / uclser20
Code for the paper "Unsupervised Contrastive Learning of Sound Event Representations", ICASSP 2021.
☆93 · Dec 22, 2022 · Updated 3 years ago
nttcslab / eval-audio-repr
EVAR ~ Evaluation package for Audio Representations
☆81 · Feb 19, 2026 · Updated 5 months ago
WangHelin1997 / DCASE-2020-Task1A-Code
A PyTorch implementation of the paper: Acoustic Scene Classification with Multiple Decision Schemes.
☆20 · Dec 12, 2020 · Updated 5 years ago
GasserElbanna / serab-byols
(Hybrid) BYOL-S feature extractor using the serab-byols package in PyTorch.
☆27 · Apr 20, 2024 · Updated 2 years ago
ben-hayes / timbre-dissimilarity-metrics
A collection of metrics for evaluating timbre dissimilarity using the TorchMetrics API
☆32 · Dec 30, 2021 · Updated 4 years ago
WangHelin1997 / AT-GCN
PyTorch implementation of the paper: Modeling Label Dependencies for Audio Tagging with Graph Convolutional Network
☆14 · Sep 18, 2020 · Updated 5 years ago
SiddGururani / mixing_secrets
☆26 · Mar 5, 2018 · Updated 8 years ago
YuanGongND / psla
Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".
☆150 · Jul 13, 2023 · Updated 3 years ago
haidog-yaqub / DPMTSE
A Diffusion Probabilistic Model for Target Sound Extraction
☆40 · Sep 27, 2024 · Updated last year
JishengBai / AudioSetCaps
A 6-million Audio-Caption Paired Dataset Built with a LLMs and ALMs-based Automatic Pipeline
☆208 · Dec 13, 2024 · Updated last year
vskadandale / instrument-recognition-polyphonic
Implementations for master thesis "Musical Instrument Recognition in Multi-Instrument Audio Contexts" with MedleyDB.
☆16 · Apr 4, 2019 · Updated 7 years ago
aispeech-lab / WASE
PyTorch implementation of WASE described in our ICASSP 2021: "Wase: Learning When to Attend for Speaker Extraction in Cocktail Party Envi…
☆27 · Jan 11, 2022 · Updated 4 years ago
kaen2891 / adversarial_fine-tuning_using_generated_respiratory_sound
(NeurIPS 2023 Workshop on DGM4H) Official Implementation of "Adversarial Fine-tuning using Generated Respiratory Sound to Address Class I…
☆19 · Dec 5, 2024 · Updated last year
WangHelin1997 / Automatic_Speech_Annotator
Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat…
☆33 · Jun 14, 2024 · Updated 2 years ago
huaidanquede / Dense-TSNet
Official code for Dense-TSNet
☆12 · Sep 17, 2024 · Updated last year
yangdongchao / DCASE2021Task5
The code for DCASE2021 task5 submission.
☆20 · Feb 21, 2022 · Updated 4 years ago
v-iashin / SparseSync
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022)
☆56 · Jan 29, 2024 · Updated 2 years ago
thuhcsi / IJCAI2019-DRL4SER
The Python implementation for the paper "Towards Discriminative Representation Learning for Speech Emotion Recognition" in IJCAI-2019
☆23 · Aug 12, 2019 · Updated 6 years ago
yoyolicoris / spectrogram-inversion
Spectrogram inversion tools in PyTorch. Documentation: https://spectrogram-inversion.readthedocs.io
☆51 · Jun 12, 2025 · Updated last year
WangHelin1997 / LibriLightMix-WHAMR
Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM
☆17 · Nov 7, 2024 · Updated last year
Alexander-H-Liu / dinosr
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
☆53 · Jan 18, 2024 · Updated 2 years ago
qiuqiangkong / sampleRNN_acoustic_scene_generation
☆14 · Apr 18, 2019 · Updated 7 years ago
ssrp / SubSpectralNet
SubSpectralNet - Using Sub-Spectrogram based Convolutional Neural Networks for Acoustic Scene Classification, accepted in ICASSP 2019
☆18 · Feb 20, 2019 · Updated 7 years ago
YuanGongND / uavm
Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Unifying Audio and Visual Models".
☆57 · Apr 20, 2023 · Updated 3 years ago
chorowski-lab / CPC_audio
An implementation of the Contrastive Predictive Coding (CPC) method to train audio features in an unsupervised fashion.
☆10 · Feb 22, 2022 · Updated 4 years ago
ilyassmoummad / scl_icbhi2017
PyTorch implementation of our work: Pretraining Respiratory Sound Representations using Metadata and Contrastive Learning (WASPAA 2023)
☆33 · Feb 4, 2024 · Updated 2 years ago
WangHelin1997 / GL-AT
PyTorch implementation of the paper: A Global-local Attention Framework for Weakly Labelled Audio Tagging.
☆13 · Feb 6, 2021 · Updated 5 years ago
nttcslab / byol-a
BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation
☆237 · Apr 26, 2023 · Updated 3 years ago
kaiidams / soundstream-pytorch
Unofficial SoundStream implementation in PyTorch with training code and 16kHz pretrained checkpoint
☆82 · Feb 9, 2026 · Updated 5 months ago
swagshaw / ASC-CL
Official PyTorch Implementation for Continual Learning For On-Device Environmental Sound Classification
☆14 · Jul 19, 2022 · Updated 4 years ago
DigitalPhonetics / cyclegan-emotion-transfer
CycleGAN-based Emotion Style Transfer as Data Augmentation for Speech Emotion Recognition
☆12 · Oct 7, 2019 · Updated 6 years ago
migperfer / AutoMashupper
Tool to aid in the creation of mashups
☆21 · Apr 7, 2020 · Updated 6 years ago
tqbl / ood_audio
An audio classification system for learning with out-of-distribution data
☆33 · Dec 8, 2022 · Updated 3 years ago