WangHelin1997 / SpecAugment-plus
A PyTorch implementation of the paper "SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification"
☆31Updated 3 years ago
Related projects
Alternatives and complementary repositories for SpecAugment-plus
- ☆26Updated last year
- Learning differentiable temporal resolution on time-series data.☆32Updated last year
- Experiments on AudioSet☆43Updated last year
- System that ranks 2nd in DCASE 2022 Challenge Task 5: Few-shot Bioacoustic Event Detection☆27Updated 2 years ago
- LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT☆69Updated 2 years ago
- (Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT☆38Updated 2 months ago
- ☆36Updated 2 years ago
- Multi-task speech classification of the accent and gender of an English speaker on Mozilla's Common Voice dataset☆23Updated 2 months ago
- ☆41Updated last year
- Streaming Audiotransformers for online Audio tagging☆41Updated 4 months ago
- The PyTorch implementation of the paper: Masked Spectrogram Prediction For Self-Supervised Audio Pre-Training☆39Updated last year
- A toolkit dedicated to speech evaluation.☆18Updated last month
- A PyTorch implementation of MBNET: MOS PREDICTION FOR SYNTHESIZED SPEECH WITH MEAN-BIAS NETWORK☆61Updated 3 years ago
- ☆29Updated 2 years ago
- ICASSP 2025: Dynamic Embedding Causal Target Speech Extraction☆28Updated last month
- For students who would like to apply for RA, PhD, postdoc in audio research.☆24Updated 2 weeks ago
- Source code for ICASSP 2022 "Pseudo Strong labels for large scale weakly supervised audio tagging"☆30Updated 2 years ago
- ☆18Updated 2 years ago
- A toolkit for researchers in multimodal sound separation.☆16Updated last year
- Exploring Binary Classification Loss for Speaker Verification☆14Updated last year
- SRTNet☆24Updated last year
- ☆15Updated 2 years ago
- PyTorch implementation of the paper: A Global-local Attention Framework for Weakly Labelled Audio Tagging.☆13Updated 3 years ago
- Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer☆83Updated 2 years ago
- The source code of Tim-TSENet☆12Updated 2 years ago
- Code and data recipes for the paper: Heterogeneous Target Speech Separation☆39Updated last year
- ☆27Updated last year
- Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)☆32Updated last year
- Source code and demo for INTERSPEECH 2023 paper: DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion P…☆34Updated 11 months ago
- Official Implementation of TSELM: Target speaker extraction using discrete tokens and language models☆20Updated last month