System that ranks 2nd in DCASE 2022 Challenge Task 5: Few-shot Bioacoustic Event Detection
☆28Jul 6, 2022Updated 3 years ago
Alternatives and similar repositories for DCASE_2022_Task_5
Users that are interested in DCASE_2022_Task_5 are comparing it to the libraries listed below
- The code for DCASE2021 task5 submission.☆20Feb 21, 2022Updated 4 years ago
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …☆11Nov 6, 2024Updated last year
- System that ranked 2nd in DCASE 2023 Challenge Task 5: Few-shot Bioacoustic Event Detection☆12Sep 5, 2024Updated last year
- ☆60Jul 2, 2024Updated last year
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆11May 14, 2025Updated 10 months ago
- A simple command line tool to calculate WER for ASR.☆14Oct 14, 2024Updated last year
- ☆30Jan 22, 2026Updated 2 months ago
- [EMNLP 2025 Findings] Official code for EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion☆36Sep 9, 2025Updated 6 months ago
- ☆12Mar 11, 2025Updated last year
- LoRA-based phoneme/prosody control for LLM-based TTS with no G2P - Lightweight adapter for edit and control the target language's phoneme…☆23Aug 14, 2025Updated 7 months ago
- Unofficial implementation of ConvNeXt-TTS powered by lightning☆18Oct 20, 2024Updated last year
- ☆54Jun 3, 2020Updated 5 years ago
- Code for the submitted 2021 DCASE Workshop paper: "Waveforms and Spectrograms: Enhancing Acoustic Scene Classification Using Multimodal F…☆16Aug 9, 2021Updated 4 years ago
- unofficial implementation of "CPTNN: CROSS-PARALLEL TRANSFORMER NEURAL NETWORK FOR TIME-DOMAIN SPEECH ENHANCEMENT"☆15Nov 14, 2023Updated 2 years ago
- Dynamic vision-guided speaker embedding for audio-visual speaker diarization☆12Jul 5, 2022Updated 3 years ago
- Code for paper Learning Audio-Visual Dereverberation☆31Aug 10, 2022Updated 3 years ago
- implementation of Monaural Speech Enhancement with Recursive Learning in the Time Domain☆48Nov 4, 2020Updated 5 years ago
- Official implementation of paper: Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-…☆40Sep 18, 2024Updated last year
- Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…☆77Jul 16, 2023Updated 2 years ago
- Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".☆149Jul 13, 2023Updated 2 years ago
- steps to perform text-based speaker diarization with kaldi toolkit☆12Nov 2, 2018Updated 7 years ago
- Learning differentiable temporal resolution on time-series data.☆37Nov 12, 2022Updated 3 years ago
- ☆12Nov 7, 2024Updated last year
- MicRank is a Learning to Rank neural channel selection framework where a DNN is trained to rank microphone channels.☆22Apr 8, 2021Updated 4 years ago
- Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer☆92Jun 9, 2022Updated 3 years ago
- Bandwidth Extension of Historical Recordings using Generative Adversarial Networks☆35May 25, 2023Updated 2 years ago
- Pushing the Limits of Zero-shot End-to-End Speech Translation☆26Dec 12, 2024Updated last year
- Code for "Simple Pooling Front-ends for Efficient Audio Classification", ICASSP 2023☆57Mar 3, 2023Updated 3 years ago
- ☆51Mar 5, 2026Updated 2 weeks ago
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications☆90Dec 20, 2024Updated last year
- Learnable STRF, from Riad et al. 2021 JASA☆13Aug 21, 2021Updated 4 years ago
- ☆25Jul 20, 2021Updated 4 years ago
- PyTorch implementation of Continuous Speech Separation☆12Oct 5, 2022Updated 3 years ago
- PyTorch port of Google Research's LEAF Audio paper☆93May 19, 2021Updated 4 years ago
- Implementation of Transfer Learning from Speaker Verification to Multi-speaker Text-To-Speech Synthesis (SV2TTS) in Persian language.☆13Oct 2, 2025Updated 5 months ago
- ☆23Apr 25, 2022Updated 3 years ago
- Sisyphus recipes for ASR☆19Mar 16, 2026Updated last week
- Source code and demo for INTERSPEECH 2023 paper: DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion P…☆37Dec 5, 2023Updated 2 years ago
- ☆118May 13, 2025Updated 10 months ago