ta012/SSLAM

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/ta012/SSLAM)

ta012 / SSLAM

[ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes

☆79

Alternatives and similar repositories for SSLAM

Users that are interested in SSLAM are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

Audio-AGI / dcase2024_task9_baseline
View on GitHub
Baseline for DCASE 2024 Task 9: "Language-Queried Audio Source Separation"
☆26Mar 27, 2024Updated 2 years ago
HanxunH / AudioMosaic
View on GitHub
[ICML2026] AudioMosaic: Contrastive Masked Audio Representation Learning
☆22May 15, 2026Updated 2 months ago
cwx-worst-one / EAT
View on GitHub
[IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer
☆237Nov 30, 2025Updated 7 months ago
Sreyan88 / ReCLAP
View on GitHub
☆33Dec 23, 2025Updated 6 months ago
Phuriches / GenRepASD
View on GitHub
Pytorch implementation of Deep Generic Representations for Domain-Generalized Anomalous Sound Detection: https://arxiv.org/abs/2409.05035
☆28Mar 16, 2025Updated last year
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
WWWWxp / Speech-Tokenizer-Papers
View on GitHub
This repository collects papers related to Speech Tokenizer.
☆18Oct 16, 2024Updated last year
zszheng147 / Spatial-AST
View on GitHub
🦇 Encoder of BAT (Learning to Reason about Spatial Sounds with Large Language Models)
☆87Feb 13, 2025Updated last year
Audio-WestlakeU / audiossl
View on GitHub
A library built for easier audio self-supervised training, downstream tasks evaluation
☆140Sep 25, 2025Updated 9 months ago
yucongzh / ECHO
View on GitHub
ECHO: Frequency-aware Hierarchical Encoding for Variable-length Signal
☆18Jul 3, 2026Updated 2 weeks ago
WangHelin1997 / SoloAudio
View on GitHub
SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.
☆119Jan 28, 2026Updated 5 months ago
soham97 / ADIFF
View on GitHub
Explaining audio differences using language
☆16Feb 11, 2025Updated last year
yzyouzhang / Audio_Research_in_US
View on GitHub
Audio Research in US. US-based professors who work on audio (music, speech, acoustics). For students who would like to apply for RA, PhD,…
☆27Feb 27, 2026Updated 4 months ago
Audio-AGI / FlowSep
View on GitHub
Official implementation for FlowSep
☆77Jan 2, 2025Updated last year
lmxue / Audio-FLAN
View on GitHub
Audio-FLAN
☆161Sep 23, 2025Updated 9 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
apple-yinhan / TQ-SED
View on GitHub
☆23Mar 19, 2025Updated last year
JusperLee / Gull-Codec-Training
View on GitHub
☆12Mar 11, 2025Updated last year
JishengBai / ICME2024ASC
View on GitHub
baseline for IEEE ICME 2024 GC: Semi-supervised Acoustic Scene Classification under Domain Shift
☆18Mar 16, 2024Updated 2 years ago
Liu-Tianchi / Golden-Gemini-for-Speaker-Verification
View on GitHub
Official release of pretrained models and codes for 'Golden Gemini Is All You Need: Finding the Sweet Spots for Speaker Verification'
☆15Jan 20, 2025Updated last year
facebookresearch / FlowDec
View on GitHub
An neural full-band audio codec for general audio sampled at 48 kHz with 7.5 kps or 4.5 kbps.
☆212Jun 22, 2026Updated 3 weeks ago
xiquan-li / FineLAP
View on GitHub
[ACL 2026 Main] FineLAP: Taming Heterogeneous Supervision for Fine-grained Language-Audio Pre-training
☆36Apr 20, 2026Updated 3 months ago
wsntxxn / UniFlow-Audio
View on GitHub
☆72Updated this week
SWivid / AUV
View on GitHub
An All-in-One Speech, Sound, Music Codec with Single Nested Codebook
☆28Oct 11, 2025Updated 9 months ago
YoonjinXD / kadtk
View on GitHub
A standardized toolkit of Kernel Audio Distance (KAD)—a distribution-free, unbiased, and computationally efficient metric for evaluating …
☆104Jun 12, 2025Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
uthree / ddsp-vocoder
View on GitHub
☆12Nov 7, 2024Updated last year
Jiang-Yidi / UniCodec
View on GitHub
[ACL 2025 Main] UniCodec: a unified audio codec with a single codebook to support multi-domain audio data, including speech, music, and s…
☆156May 30, 2025Updated last year
ICDM-UESTC / DOSE
View on GitHub
DOSE: Diffusion Dropout with Adaptive Prior for Speech Enhancement, Conference on Neural Information Processing Systems (NeurIPS), 2023
☆60May 16, 2025Updated last year
apple-yinhan / EnvSDD
View on GitHub
Official code for EnvSDD (Environmental Sound Deepfake Detection)
☆35May 17, 2026Updated 2 months ago
mileskuo42 / AudioMarkBench
View on GitHub
Dataset/code for AudioMarkBench: Benchmarking Robustness of Audio Watermarking
☆48Aug 23, 2024Updated last year
FakeSoundData / FakeSound
View on GitHub
☆20Jul 19, 2024Updated 2 years ago
SVDDChallenge / CtrSVDD_Utils
View on GitHub
☆18Jan 10, 2024Updated 2 years ago
xiquan-li / MeanAudio
View on GitHub
[ACL 2026 Main] MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows
☆142Sep 2, 2025Updated 10 months ago
yuhanghe01 / RiTTA
View on GitHub
Event Relation in Text-to-Audio (TTA) Generation
☆21Feb 26, 2025Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
soham97 / mellow
View on GitHub
small audio language model for reasoning
☆88Dec 4, 2025Updated 7 months ago
facebookresearch / WavFlow
View on GitHub
MultiModal Audio Generation in Raw Waveform Space.
☆154May 26, 2026Updated last month
jaeyeonkim99 / visage
View on GitHub
Official implementation of "ViSAGe: Video-to-Spatial AUdio Generation" (ICLR 2025)
☆47Sep 10, 2025Updated 10 months ago
NVIDIA / audio-intelligence
View on GitHub
Elucidated Text-To-Audio (ETTA) is a SOTA text-to-audio model with a holistic understanding of the design space and trained with syntheti…
☆135Mar 3, 2026Updated 4 months ago
bfs18 / armel
View on GitHub
poorman's ar-dit tts
☆45Dec 31, 2025Updated 6 months ago
Sreyan88 / CompA
View on GitHub
Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models
☆23Jul 10, 2024Updated 2 years ago
cuichenrui2000 / barry_speech_tools
View on GitHub
This repository documents Barry's journey in learning deep learning for speech processing. Here, you'll find scripts and code snippets re…
☆13Oct 8, 2025Updated 9 months ago