haoheliu/DCASE_2022_Task_5

System that ranks 2nd in DCASE 2022 Challenge Task 5: Few-shot Bioacoustic Event Detection

☆28

Alternatives and similar repositories for DCASE_2022_Task_5

Users that are interested in DCASE_2022_Task_5 are comparing it to the libraries listed below.

yangdongchao / DCASE2021Task5
The code for DCASE2021 task5 submission.
☆20 · Feb 21, 2022 · Updated 4 years ago
michaelneri / unsupervised-audio-anomaly-detection
Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …
☆11 · Nov 6, 2024 · Updated last year
ilyassmoummad / dcase23_task5_scl
System that ranked 2nd in DCASE 2023 Challenge Task 5: Few-shot Bioacoustic Event Detection
☆12 · Sep 5, 2024 · Updated last year
c4dm / dcase-few-shot-bioacoustic
☆61 · Jul 2, 2024 · Updated 2 years ago
WangHelin1997 / Aty-TTS
Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech
☆11 · May 14, 2025 · Updated last year
pkufool / simple-wer
A simple command line tool to calculate WER for ASR.
☆14 · Oct 14, 2024 · Updated last year
JusperLee / Gull-Codec-Training
☆12 · Mar 11, 2025 · Updated last year
qiuqiangkong / sound_event_detection_dcase2017_task4
☆55 · Jun 3, 2020 · Updated 6 years ago
reppy4620 / convnext_tts
Unofficial implementation of ConvNeXt-TTS powered by lightning
☆18 · Oct 20, 2024 · Updated last year
denfed / wave-spec-fusion
Code for the submitted 2021 DCASE Workshop paper: "Waveforms and Spectrograms: Enhancing Acoustic Scene Classification Using Multimodal F…
☆16 · Aug 9, 2021 · Updated 4 years ago
Honee-W / CPTNN
Unofficial implementation of "CPTNN: CROSS-PARALLEL TRANSFORMER NEURAL NETWORK FOR TIME-DOMAIN SPEECH ENHANCEMENT"
☆15 · Nov 14, 2023 · Updated 2 years ago
zaocan666 / DyViSE
Dynamic vision-guided speaker embedding for audio-visual speaker diarization
☆12 · Jul 5, 2022 · Updated 4 years ago
facebookresearch / learning-audio-visual-dereverberation
Code for paper Learning Audio-Visual Dereverberation
☆32 · Aug 10, 2022 · Updated 3 years ago
Aria-K-Alethia / laughter-synthesis
Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…
☆77 · Jul 16, 2023 · Updated 3 years ago
YuanGongND / psla
Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".
☆150 · Jul 13, 2023 · Updated 3 years ago
bootphon / learnable-strf
Learnable STRF, from Riad et al. 2021 JASA
☆13 · Aug 21, 2021 · Updated 4 years ago
haoheliu / diffres-python
Learning differentiable temporal resolution on time-series data.
☆36 · Nov 12, 2022 · Updated 3 years ago
uthree / ddsp-vocoder
☆12 · Nov 7, 2024 · Updated last year
talhanai / kaldi-diar-latte
Steps to perform text-based speaker diarization with the Kaldi toolkit
☆12 · Nov 2, 2018 · Updated 7 years ago
AlanBaade / MAE-AST-Public
Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer
☆93 · Jun 9, 2022 · Updated 4 years ago
eloimoliner / bwe_historical_recordings
Bandwidth Extension of Historical Recordings using Generative Adversarial Networks
☆38 · May 25, 2023 · Updated 3 years ago
mt-upc / ZeroSwot
Pushing the Limits of Zero-shot End-to-End Speech Translation
☆25 · Dec 12, 2024 · Updated last year
liuxubo717 / SimPFs
Code for "Simple Pooling Front-ends for Efficient Audio Classification", ICASSP 2023
☆57 · Mar 3, 2023 · Updated 3 years ago
Adibian / Persian-MultiSpeaker-Tacotron2
Implementation of Transfer Learning from Speaker Verification to Multi-speaker Text-To-Speech Synthesis (SV2TTS) in Persian language.
☆13 · Oct 2, 2025 · Updated 9 months ago
exercise-book-yq / Supercodec
☆51 · Mar 5, 2026 · Updated 4 months ago
hhguo / SoCodec
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
☆92 · Dec 20, 2024 · Updated last year
ifnspaml / Enhancement-Coded-Speech
☆24 · Apr 25, 2022 · Updated 4 years ago
desh2608 / css
PyTorch implementation of Continuous Speech Separation
☆12 · Oct 5, 2022 · Updated 3 years ago
denfed / leaf-audio-pytorch
PyTorch port of Google Research's LEAF Audio paper
☆91 · May 19, 2021 · Updated 5 years ago
Andong-Li-speech / RTNet
Implementation of Monaural Speech Enhancement with Recursive Learning in the Time Domain
☆47 · Nov 4, 2020 · Updated 5 years ago
lmaxwell / McHuo
A Chinese singing voice dataset, professional male singer, 105 songs, 132 minutes
☆12 · Oct 19, 2023 · Updated 2 years ago
WangHelin1997 / DuTa-VC
Source code and demo for INTERSPEECH 2023 paper: DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion P…
☆38 · Dec 5, 2023 · Updated 2 years ago
WangHelin1997 / Automatic_Speech_Annotator
Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat…
☆33 · Jun 14, 2024 · Updated 2 years ago
schufo / tisms
This is the code of the ICASSP 2020 paper "Joint phoneme alignment and text-informed speech separation on highly corrupted speech"
☆16 · Apr 8, 2024 · Updated 2 years ago
bshall / dusted
DUSTED: Spoken-Term Discovery using Discrete Speech Units
☆17 · Oct 2, 2024 · Updated last year
cpii-cai / PunCantonese
A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts
☆15 · Dec 3, 2024 · Updated last year
Sreyan88 / RECAP
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
☆16 · Jun 23, 2024 · Updated 2 years ago
frednam93 / FDY-SED
☆96 · Jun 22, 2023 · Updated 3 years ago
chaufanglin / Normal2Whisper
Implementation of "Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation"
☆14 · Oct 31, 2024 · Updated last year