Unofficial implementation of FSD50K baselines for Sound Event Recognition
☆26 · Apr 27, 2024 · Updated last year
Alternatives and similar repositories for fsd50k-pytorch
Users who are interested in fsd50k-pytorch are comparing it to the libraries listed below.
- Experiments about AudioSet ☆43 · Jul 22, 2023 · Updated 2 years ago
- The code used to create the ARCA23K and ARCA23K-FSD datasets ☆15 · Nov 9, 2021 · Updated 4 years ago
- PyTorch implementation of [Learning to match transient sound events using attentional similarity for few-shot sound recognition] ☆33 · Feb 27, 2019 · Updated 7 years ago
- Source code for training models and using the hyperbolic interface proposed in our ICASSP 2023 paper, “Hyperbolic Audio Source Separation… ☆69 · Apr 27, 2023 · Updated 2 years ago
- Mesostructures: Beyond Spectrogram Loss in Differentiable Time-Frequency Analysis (Meso-DTFA) ☆21 · Jul 6, 2023 · Updated 2 years ago
- CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding ☆22 · Dec 17, 2025 · Updated 2 months ago
- Polyphonic generalisation of DDSP ☆22 · Apr 30, 2024 · Updated last year
- A PyTorch implementation of Conv-TasNet ☆46 · Nov 25, 2019 · Updated 6 years ago
- This package aims at simplifying the download of the AudioSet dataset. ☆58 · Jul 17, 2025 · Updated 7 months ago
- ☆23 · Aug 2, 2019 · Updated 6 years ago
- Code for "Simple Pooling Front-ends for Efficient Audio Classification", ICASSP 2023 ☆57 · Mar 3, 2023 · Updated 3 years ago
- Comparison of Python audio resampling implementations ☆54 · Jun 30, 2021 · Updated 4 years ago
- Creative Text-to-Audio Generation via Synthesizer Programming @ ICML'24 ☆32 · Sep 26, 2024 · Updated last year
- An official repo for the paper "Adapting Language-Audio Models as Few-Shot Audio Learners" ☆31 · May 31, 2023 · Updated 2 years ago
- ☆35 · Aug 16, 2024 · Updated last year
- Speech command recognition with capsule network & various NNs / KWS on Google Speech Command Dataset. ☆25 · Jan 28, 2019 · Updated 7 years ago
- ☆60 · Jul 2, 2024 · Updated last year
- Who calls the shots? Rethinking Few-Shot Learning for Audio (WASPAA 2021) ☆43 · May 24, 2022 · Updated 3 years ago
- This SDK allows web-based apps/pages to interact with dictation devices ☆17 · Feb 12, 2026 · Updated 3 weeks ago
- A list of resources that can help in research for automated audio captioning ☆34 · Feb 17, 2021 · Updated 5 years ago
- Baseline multi-resolution cross network model trained using the Divide and Remaster Dataset ☆87 · Jan 25, 2024 · Updated 2 years ago
- This repo hosts the code and model of "Separate What You Describe: Language-Queried Audio Source Separation", Interspeech 2022 ☆145 · Oct 11, 2023 · Updated 2 years ago
- Latte: Cross-framework Python Package for Evaluation of Latent-based Generative Models ☆37 · Jul 29, 2025 · Updated 7 months ago
- Speech Emotion Recognition using Deep Learning ☆12 · May 24, 2021 · Updated 4 years ago
- A lightweight library to read/write wave audio files to/from lists of native Python types. ☆12 · Jun 10, 2024 · Updated last year
- Supporting code for instrumentation courses at Universidade Nova de Lisboa - Faculdade de Ciência de Lisboa ☆16 · Oct 7, 2022 · Updated 3 years ago
- BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation ☆229 · Apr 26, 2023 · Updated 2 years ago
- Implementation of the paper, T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis, ac… ☆34 · May 25, 2024 · Updated last year
- ☆12 · Jun 17, 2019 · Updated 6 years ago
- Tool for slot extraction from text ☆15 · Oct 23, 2022 · Updated 3 years ago
- ☆10 · Feb 12, 2026 · Updated 3 weeks ago
- ☆10 · Mar 5, 2024 · Updated 2 years ago
- The projects and materials that accompany the Face Tracking with RealityKit course ☆13 · Feb 2, 2021 · Updated 5 years ago
- Simulation of some common communication system structures ☆11 · Feb 27, 2023 · Updated 3 years ago
- Implements Global Word Vectors. ☆11 · Feb 8, 2020 · Updated 6 years ago
- Debiasing Through Data Attribution ☆12 · May 23, 2024 · Updated last year
- A Diffusion Probabilistic Model for Target Sound Extraction ☆40 · Sep 27, 2024 · Updated last year
- Interface for Controllable Expressive Talking Machine ☆40 · Sep 20, 2025 · Updated 5 months ago
- Official repo for "A MODULATION-DOMAIN LOSS FOR NEURAL-NETWORK-BASED REAL-TIME SPEECH ENHANCEMENT" to appear in ICASSP 2021 ☆44 · Oct 14, 2021 · Updated 4 years ago