JishengBai / ICME2024ASC
Baseline for IEEE ICME 2024 GC: Semi-supervised Acoustic Scene Classification under Domain Shift
☆18Mar 16, 2024Updated last year
Alternatives and similar repositories for ICME2024ASC
Users who are interested in ICME2024ASC are comparing it to the repositories listed below.
- WildDESED: A LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection☆17Nov 19, 2024Updated last year
- Source code for Consistent ensemble distillation for audio tagging☆56Jun 12, 2025Updated 8 months ago
- Baseline for DCASE 2024 Task 9: "Language-Queried Audio Source Separation"☆26Mar 27, 2024Updated last year
- A library built for easier audio self-supervised training, downstream tasks evaluation☆136Sep 25, 2025Updated 4 months ago
- A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline☆195Dec 13, 2024Updated last year
- Official code release for "TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion", accepted at ICIST 2023☆12Mar 17, 2024Updated last year
- ☆13Jan 2, 2025Updated last year
- ☆29Jul 4, 2025Updated 7 months ago
- SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech☆11Jun 30, 2023Updated 2 years ago
- ☆12Aug 10, 2023Updated 2 years ago
- PyTorch implementation of the paper: A Global-local Attention Framework for Weakly Labelled Audio Tagging.☆13Feb 6, 2021Updated 5 years ago
- ☆13Jan 3, 2024Updated 2 years ago
- Code for "Simple Pooling Front-ends for Efficient Audio Classification", ICASSP 2023☆57Mar 3, 2023Updated 2 years ago
- [IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer☆221Nov 30, 2025Updated 2 months ago
- AudioLDM training, finetuning, evaluation and inference.☆14Mar 27, 2024Updated last year
- Official Implementation of our Interspeech 2021 paper "An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure …☆16Feb 15, 2022Updated 4 years ago
- ☆15Jun 15, 2022Updated 3 years ago
- Improved Speech Enhancement GANs☆12Jun 24, 2020Updated 5 years ago
- This repo contains conv-tasnet for basis-melgan. If you want to get code of basis-melgan, please refer to FastVocoder.☆21Jul 21, 2021Updated 4 years ago
- TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages☆18May 23, 2024Updated last year
- Full-frequency dynamic convolution: a physical frequency-dependent convolution for sound event detection☆22Aug 22, 2024Updated last year
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo…☆20Sep 1, 2023Updated 2 years ago
- The official code repo for "Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data", in AAAI 2022☆210Jul 14, 2022Updated 3 years ago
- This repository is for The Power of Sound(TPoS): Audio Reactive Video Generation with Stable Diffusion (ICCV2023)☆25Dec 7, 2023Updated 2 years ago
- ASiT: Audio Spectrogram vIsion Transformer for General Audio Representation☆28Mar 10, 2024Updated last year
- ☆31Jun 30, 2023Updated 2 years ago
- ☆61Nov 4, 2023Updated 2 years ago
- A list of current Audio-Vision Multimodal with awesome resources (paper, application, data, review, survey, etc.).☆32Oct 11, 2023Updated 2 years ago
- ☆187Nov 19, 2025Updated 2 months ago
- The open source implementation of the cross attention mechanism from the paper: "JOINTLY TRAINING LARGE AUTOREGRESSIVE MULTIMODAL MODELS"☆37Mar 11, 2024Updated last year
- This repository contains the code of the CP JKU submission to DCASE23 Task 1 "Low-complexity Acoustic Scene Classification"☆31Sep 18, 2023Updated 2 years ago
- Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".☆286Mar 20, 2024Updated last year
- FI-SLAM: Feature Information-Based Robust and Efficient Vision-Inertial-Aided LiDAR SLAM☆28Jun 4, 2024Updated last year
- A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup.☆76Oct 22, 2024Updated last year
- ☆32Apr 1, 2023Updated 2 years ago
- This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training …☆328Nov 20, 2024Updated last year
- Repository of the WACV'24 paper "Can CLIP Help Sound Source Localization?"☆34Feb 21, 2025Updated 11 months ago
- This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge.☆32Jan 26, 2024Updated 2 years ago
- This is the implementation for "ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Rhythm"☆134Nov 29, 2023Updated 2 years ago