JishengBai/ICME2024ASC

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/JishengBai/ICME2024ASC)

JishengBai / ICME2024ASC

baseline for IEEE ICME 2024 GC: Semi-supervised Acoustic Scene Classification under Domain Shift

☆18

Alternatives and similar repositories for ICME2024ASC

Users that are interested in ICME2024ASC are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

swagshaw / WildDESED
View on GitHub
WildDESED: A LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection
☆18Nov 19, 2024Updated last year
RicherMans / CED
View on GitHub
Source code for Consistent ensemble distillation for audio tagging
☆75Mar 20, 2026Updated 4 months ago
apple-yinhan / Noise-robust-SED
View on GitHub
☆14Jan 2, 2025Updated last year
Audio-AGI / dcase2024_task9_baseline
View on GitHub
Baseline for DCASE 2024 Task 9: "Language-Queried Audio Source Separation"
☆26Mar 27, 2024Updated 2 years ago
JishengBai / AudioSetCaps
View on GitHub
A 6-million Audio-Caption Paired Dataset Built with a LLMs and ALMs-based Automatic Pipeline
☆208Dec 13, 2024Updated last year
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
nttcslab / dcase2023_task2_evaluator
View on GitHub
☆12Aug 10, 2023Updated 2 years ago
Audio-WestlakeU / audiossl
View on GitHub
A library built for easier audio self-supervised training, downstream tasks evaluation
☆140Sep 25, 2025Updated 10 months ago
DCASE2024-Task7-Sound-Scene-Synthesis / AudioLDM-training-finetuning
View on GitHub
AudioLDM training, finetuning, evaluation and inference.
☆14Mar 27, 2024Updated 2 years ago
Ming-er / LGC-SED
View on GitHub
☆13Jan 3, 2024Updated 2 years ago
WangHelin1997 / GL-AT
View on GitHub
Pytorch implementation of the paper : A Global-local Attention Framework for Weakly Labelled Audio Tagging.
☆13Feb 6, 2021Updated 5 years ago
qiuqiangkong / mini_llm
View on GitHub
☆29Jul 4, 2025Updated last year
ta012 / SSLAM
View on GitHub
[ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes
☆79Oct 8, 2025Updated 9 months ago
apple-yinhan / TQ-SED
View on GitHub
☆24Mar 19, 2025Updated last year
michaelneri / unsupervised-audio-anomaly-detection
View on GitHub
Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …
☆11Nov 6, 2024Updated last year
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
cwx-worst-one / EAT
View on GitHub
[IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer
☆239Nov 30, 2025Updated 7 months ago
spkgyk / TDFNet
View on GitHub
Official code release for "TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion", accepted ICIST 2023
☆14Mar 17, 2024Updated 2 years ago
Harper812 / FFDConv
View on GitHub
Full-frequency dynamic convolution: a physical frequency-dependent convolution for sound event detection
☆27May 13, 2026Updated 2 months ago
maum-ai / sane-tts
View on GitHub
SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech
☆11Jun 30, 2023Updated 3 years ago
deepakbaby / isegan
View on GitHub
Improved Speech Enhancement GANs
☆13Jun 24, 2020Updated 6 years ago
fschmid56 / cpjku_dcase23
View on GitHub
This repository contains the code of the CP JKU submission to DCASE23 Task 1 "Low-complexity Acoustic Scene Classification"
☆32Sep 18, 2023Updated 2 years ago
SVDDChallenge / CtrSVDD_Utils
View on GitHub
☆18Jan 10, 2024Updated 2 years ago
lucadellalib / ts-asr
View on GitHub
Target speaker automatic speech recognition (TS-ASR)
☆14Oct 14, 2023Updated 2 years ago
OzerCanDevecioglu / Exploring-Sound-vs-Vibration-for-Robust-Fault-Detection-on-Rotating-Machinery
View on GitHub
☆13Jul 4, 2024Updated 2 years ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
zeyuxie29 / AudioTime
View on GitHub
☆39Jul 4, 2024Updated 2 years ago
liuxubo717 / SimPFs
View on GitHub
Code for "Simple Pooling Front-ends for Efficient Audio Calssification", ICASSP 2023
☆57Mar 3, 2023Updated 3 years ago
C-Fun / Self-Attentive-Pooling-for-Efficient-Deep-Learning
View on GitHub
Official PyTorch implementation of the paper entitled 'Self Attentive Pooling for Efficient Deep Learning'.
☆13May 3, 2024Updated 2 years ago
ictnlp / NAST-S2x
View on GitHub
A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup.
☆78Oct 22, 2024Updated last year
juhayna-zh / Awesome-Music-Generation-Papers
View on GitHub
Curated list of groundbreaking music generation research.
☆21Apr 24, 2026Updated 3 months ago
zexupan / avse_hybrid_loss
View on GitHub
☆16Jun 15, 2022Updated 4 years ago
xiaomi-research / dasheng-denoiser
View on GitHub
Official PyTorch inference code for the Interspeech 2025 paper: Efficient Speech Enhancement via Embeddings from Pre-trained Generative A…
☆81Jun 16, 2025Updated last year
marmoi / dcase2023_task4b_baseline
View on GitHub
Baseline code for DCASE 2023 task 4 B
☆15Apr 21, 2023Updated 3 years ago
RetroCirce / Zero_Shot_Audio_Source_Separation
View on GitHub
The official code repo for "Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data", in AAAI 2022
☆213Jul 14, 2022Updated 4 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
xcmyz / ConvTasNet4BasisMelGAN
View on GitHub
This repo contains conv-tasnet for basis-melgan. If you want to get code of basis-melgan, please refer to FastVocoder.
☆21Jul 21, 2021Updated 5 years ago
SWivid / AUV
View on GitHub
An All-in-One Speech, Sound, Music Codec with Single Nested Codebook
☆28Oct 11, 2025Updated 9 months ago
johnmartinsson / differentiable-mel-spectrogram
View on GitHub
The official implementation of DMEL the method presented in the paper "DMEL: The differentiable log-Mel spectrogram as a trainable layer …
☆24Dec 21, 2024Updated last year
hmartelb / avlit
View on GitHub
Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo…
☆20Sep 1, 2023Updated 2 years ago
ms-dot-k / TMT
View on GitHub
TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages
☆18May 23, 2024Updated 2 years ago
LAION-AI / Text-to-speech
View on GitHub
☆61Nov 4, 2023Updated 2 years ago
frednam93 / FilterAugSED
View on GitHub
☆68Sep 13, 2024Updated last year