drumpt / SGEM
Official PyTorch implementation of SGEM: Test-Time Adaptation for Automatic Speech Recognition via Sequential-Level Generalized Entropy Minimization (INTERSPEECH 2023 Oral Presentation)
☆37Updated last year
Alternatives and similar repositories for SGEM
Users that are interested in SGEM are comparing it to the libraries listed below
- Test-time adaptation for speech recognition model by single utterance. The official implementation of "Listen, Adapt, Better WER: Source-…☆20Updated 3 years ago
- Details of the datasets for Few-shot class-incremental audio classification☆11Updated 2 years ago
- (ICASSP 2024) Official Implementation of "Stethoscope-guided Supervised Contrastive Learning for Cross-domain Adaptation on Respiratory So…☆17Updated last year
- ☆12Updated 2 years ago
- A curated list of awesome adversarial reprogramming and input prompting methods for neural networks since 2022☆38Updated 2 years ago
- (ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation☆15Updated 9 months ago
- Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representations☆100Updated last year
- Official Implementation of "Prefix tuning for Automated Audio Captioning" (ICASSP 2023)☆31Updated 2 years ago
- A PyTorch implementation of "Rare Tokens Degenerate All Tokens: Improving Neural Text Generation via Adaptive Gradient Gating for Rare To…☆10Updated 3 years ago
- Can audio-visual integration strengthen robustness under multimodal attacks?☆29Updated 3 years ago
- ☆76Updated 3 months ago
- Speech2Vec Reality Check☆88Updated 2 years ago
- PyTorch implementation of "Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal T…☆12Updated last year
- This repo contains a script to download the MUSIC dataset from YouTube☆12Updated 2 years ago
- WildDESED: A LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection☆17Updated last year
- Revisiting Denoising Diffusion Probabilistic Models for Speech Enhancement: Condition Collapse, Efficiency and Refinement, Thirty-Seventh…☆47Updated 2 years ago
- Official Codebase of "A Closer Look at Weakly-Supervised Audio-Visual Source Localization" (NeurIPS 2022)☆20Updated 3 years ago
- [ICASSP 2023] FedAudio: A Federated Learning Benchmark for Audio and Speech Tasks☆51Updated last year
- Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024)☆78Updated 11 months ago
- Official implementation of Mitigating Sexual Content Generation via Embedding Distortion in Text-conditioned Diffusion Models☆73Updated last month
- [NeurIPS 2022] "Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Spee…☆17Updated 2 years ago
- ☆21Updated 4 years ago
- Code for ICML2020 paper - CLUB: A Contrastive Log-ratio Upper Bound of Mutual Information☆353Updated last year
- This code is a PyTorch version of the CycleFlow model in "CycleFlow: Purify Information Factors by Cycle Loss"☆15Updated 4 years ago
- Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)☆12Updated 2 years ago
- ☆19Updated last year
- Continual Learning Method RWM for AAAI 2024☆22Updated last year
- ☆12Updated 2 years ago
- Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification (INTERSPEECH 2023)☆72Updated 10 months ago
- DOSE: Diffusion Dropout with Adaptive Prior for Speech Enhancement, Conference on Neural Information Processing Systems (NeurIPS), 2023☆59Updated 8 months ago