Author's repository for reproducing DcaseNet, an integrated pre-trained DNN that performs acoustic scene classification, audio tagging, and sound event detection. Implemented using PyTorch.
☆43 · Nov 10, 2021 · Updated 4 years ago
Alternatives and similar repositories for DcaseNet
Users who are interested in DcaseNet are comparing it to the repositories listed below.
- Detect specific type of sound in audio signals ☆13 · Jun 20, 2024 · Updated last year
- Tracking beer/wine using Audio Event Detection with Machine Learning ☆15 · Jun 16, 2024 · Updated last year
- Baseline method for sound event localization task of DCASE 2021 challenge ☆42 · Jun 15, 2021 · Updated 4 years ago
- CP-JKU submission to DCASE 20 ☆45 · Apr 19, 2021 · Updated 4 years ago
- Paderborn Sound Event Detection ☆78 · Jul 18, 2023 · Updated 2 years ago
- ☆18 · Nov 10, 2019 · Updated 6 years ago
- KittenTTS is an ultra-lightweight, CPU-friendly text-to-speech model with 15M params for real-time, high-quality voices. Open source, fas… ☆23 · Updated this week
- Comparing Audio Features for Unsupervised Sound Classification ☆10 · Jun 22, 2022 · Updated 3 years ago
- ☆10 · Sep 2, 2024 · Updated last year
- ☆13 · Jan 2, 2025 · Updated last year
- An implementation of the Prism layer (https://arxiv.org/abs/2011.04823) ☆12 · Nov 13, 2020 · Updated 5 years ago
- kaldi cnn-tdnnf baseline ☆13 · Aug 31, 2021 · Updated 4 years ago
- Official PyTorch implementation of (ICME2025 oral) "AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-… ☆16 · Feb 1, 2026 · Updated 3 weeks ago
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts ☆16 · Dec 3, 2024 · Updated last year
- Openfst mirror with some fixes ☆14 · Aug 23, 2024 · Updated last year
- Using OpenVINO to speed up MeloTTS inference ☆15 · Nov 1, 2024 · Updated last year
- Echo aware source separation ☆13 · May 29, 2018 · Updated 7 years ago
- This repository contains the code of the CP JKU submission to DCASE23 Task 1 "Low-complexity Acoustic Scene Classification" ☆30 · Sep 18, 2023 · Updated 2 years ago
- StyleTTS2 + Vocos as a Decoder ☆13 · Mar 24, 2025 · Updated 11 months ago
- Python toolkit for likelihood-ratio calibration of binary classifiers ☆27 · Feb 21, 2023 · Updated 3 years ago
- DST is a Decoder-only simultaneous machine translation model, which can conduct policy decision and translation concurrently ☆11 · Jun 6, 2024 · Updated last year
- CleanUMamba: A Compact Mamba Network for Speech Denoising using Channel Pruning [Official PyTorch implementation] ☆22 · Jun 12, 2025 · Updated 8 months ago
- DCASE2020 Challenge Task 2 baseline system ☆114 · Dec 27, 2022 · Updated 3 years ago
- Prediction of sound event bounding boxes (SEBBs) ☆32 · Aug 2, 2024 · Updated last year
- In this work we propose two postprocessing approaches applying convolutional neural networks (CNNs) either in the time domain or the ceps… ☆28 · Mar 8, 2020 · Updated 5 years ago
- Multispeaker Community Vocoder Model for DiffSinger ☆39 · Aug 11, 2025 · Updated 6 months ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 · Jun 2, 2023 · Updated 2 years ago
- Visibility graphs for robust harmonic similarity measures between audio spectra ☆15 · Apr 29, 2020 · Updated 5 years ago
- Clean and modernized implementation of FastSpeech2/LightSpeech using IPA ☆18 · Aug 16, 2024 · Updated last year
- Forced alignment decoder for Whisper. ☆14 · Mar 13, 2024 · Updated last year
- Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices" ☆21 · Jun 7, 2025 · Updated 8 months ago
- Source code for publication: "Spectrum Correction: Acoustic Scene Classification with Mismatched Recording Devices" ☆13 · Feb 22, 2022 · Updated 4 years ago
- A Study of Low-Resource Speech Commands Recognition Based on Adversarial Reprogramming ☆19 · Oct 12, 2023 · Updated 2 years ago
- The Official PyTorch Implementation of "Mel-McNet: A Mel-Scale Framework for Online Multichannel Speech Enhancement" [Interspeech 2025] ☆23 · Jun 9, 2025 · Updated 8 months ago
- Visualization toolbox for Sound Event Detection ☆124 · Feb 26, 2024 · Updated 2 years ago
- ☆14 · Aug 19, 2024 · Updated last year
- ☆17 · Oct 18, 2023 · Updated 2 years ago
- NMT based punctuation prediction system using lexical and acoustic features. ☆14 · Mar 30, 2020 · Updated 5 years ago
- Code for DCASE 2020 task 1a and task 1b. ☆88 · Jan 20, 2022 · Updated 4 years ago