☆32 Apr 1, 2023 Updated 3 years ago
Alternatives and similar repositories for dcase2023_task7_baseline
Users that are interested in dcase2023_task7_baseline are comparing it to the libraries listed below.
- ☆14 Sep 20, 2023 Updated 2 years ago
- RWCP-SSD-Onomatopoeia ☆23 Jun 28, 2023 Updated 2 years ago
- Code and generated sounds for "Conditional Sound Generation Using Neural Discrete Time-Frequency Representation Learning", MLSP 2021 ☆69 Sep 3, 2021 Updated 4 years ago
- Audio captioning baseline system for DCASE 2020 challenge. ☆38 Aug 22, 2023 Updated 2 years ago
- ☆12 Jun 9, 2025 Updated 10 months ago
- ☆51 Jun 14, 2022 Updated 3 years ago
- Domestic environment sound event detection task ☆155 Jun 11, 2024 Updated last year
- acnn for text-independent speaker recognition ☆10 Feb 8, 2022 Updated 4 years ago
- AudioLDM text to audio colab ☆19 Nov 6, 2023 Updated 2 years ago
- Permutation invariant training in PyTorch ☆13 Oct 2, 2020 Updated 5 years ago
- A list of papers about audio captioning ☆79 Jul 1, 2022 Updated 3 years ago
- (SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition ☆13 Oct 22, 2024 Updated last year
- Implementation of the paper, T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis, ac… ☆34 May 25, 2024 Updated last year
- MuChoMusic is a benchmark for evaluating music understanding in multimodal audio-language models. ☆44 Dec 3, 2024 Updated last year
- Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021) ☆371 Jul 12, 2024 Updated last year
- A fast, clean, responsive Hugo theme, now for academics. ☆10 Jun 20, 2025 Updated 9 months ago
- [INTERSPEECH 2023 Best Paper Shortlist] Official implementation for MT4SSL: Boosting Self-Supervised Speech Representation Learning by In… ☆45 Mar 25, 2024 Updated 2 years ago
- Implementation of Sheffield entry for Clarity enhancement challenge. ☆18 Apr 19, 2022 Updated 3 years ago
- ☆15 Mar 30, 2020 Updated 6 years ago
- Deep neural network for audio super-resolution tasks ☆15 Sep 6, 2020 Updated 5 years ago
- This repo provides the network code and the processed samples of the manuscript "Glance and Gaze: A Collaborative Learning Framework for … ☆72 Feb 10, 2022 Updated 4 years ago
- Localization package using distance and/or angle measurements ☆16 Mar 11, 2022 Updated 4 years ago
- OpenFLAM: Framewise Language Audio Model ☆101 Jan 14, 2026 Updated 3 months ago
- Experiments from the paper "Sinusoidal Frequency Estimation by Gradient Descent" ☆61 Mar 8, 2023 Updated 3 years ago
- A vocoder that can convert audio to Mel-Spectrogram and reverse with WaveGlow, with GPU. ☆16 Feb 9, 2025 Updated last year
- This repo gives the code for the official implementation of RCT. ☆14 Jun 28, 2022 Updated 3 years ago
- This repository contains the trained models and some audio samples for the tPLCnet. ☆29 Sep 26, 2023 Updated 2 years ago
- Splits for epic-sounds dataset ☆86 Aug 2, 2025 Updated 8 months ago
- ☆14 Jun 6, 2023 Updated 2 years ago
- A PyTorch implementation of the Modified Discrete Cosine Transform (MDCT) and its inverse for audio processing. ☆32 Dec 17, 2024 Updated last year
- Reproduction of "Scyclone" with PyTorch ☆16 Jan 6, 2021 Updated 5 years ago
- The source code of our paper "Diffsound: discrete diffusion model for text-to-sound generation" ☆366 Aug 3, 2023 Updated 2 years ago
- We design a spectral compression mapping (SCM) for full-band speech enhancement, and propose a two-stage stream named MHA-DPCRN ☆24 Jul 4, 2022 Updated 3 years ago
- How to create a pick up and interaction system in Unity 3d ☆12 May 5, 2022 Updated 3 years ago
- spectrogram inversion tools in PyTorch. Documentation: https://spectrogram-inversion.readthedocs.io ☆51 Jun 12, 2025 Updated 10 months ago
- ☆117 Mar 24, 2026 Updated 3 weeks ago
- An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech" ☆22 Jul 5, 2023 Updated 2 years ago
- Unofficial download repository for MusicCaps ☆47 Apr 21, 2023 Updated 2 years ago
- baseline for IEEE ICME 2024 GC: Semi-supervised Acoustic Scene Classification under Domain Shift ☆18 Mar 16, 2024 Updated 2 years ago