SarthakYadav / fsd50k-pytorch
Unofficial implementation of FSD50k baselines for Sound Event Recognition
☆26Apr 27, 2024Updated last year
Alternatives and similar repositories for fsd50k-pytorch
Users who are interested in fsd50k-pytorch compare it to the libraries listed below.
- experiments about AudioSet☆43Jul 22, 2023Updated 2 years ago
- The code used to create the ARCA23K and ARCA23K-FSD datasets☆14Nov 9, 2021Updated 4 years ago
- PyTorch implementation of [Learning to match transient sound events using attentional similarity for few-shot sound recognition]☆33Feb 27, 2019Updated 6 years ago
- Source code for training models and using the hyperbolic interface proposed in our ICASSP 2023 paper, “Hyperbolic Audio Source Separation…☆69Apr 27, 2023Updated 2 years ago
- Mesostructures: Beyond Spectrogram Loss in Differentiable Time-Frequency Analysis (Meso-DTFA)☆21Jul 6, 2023Updated 2 years ago
- CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding☆22Dec 17, 2025Updated 2 months ago
- Polyphonic generalisation of DDSP☆22Apr 30, 2024Updated last year
- A PyTorch implementation of Conv-TasNet☆46Nov 25, 2019Updated 6 years ago
- Code for "Simple Pooling Front-ends for Efficient Audio Classification", ICASSP 2023☆57Mar 3, 2023Updated 2 years ago
- Comparison of Python audio resampling implementations☆54Jun 30, 2021Updated 4 years ago
- Creative Text-to-Audio Generation via Synthesizer Programming @ ICML'24☆32Sep 26, 2024Updated last year
- An official repo for the paper "Adapting Language-Audio Models as Few-Shot Audio Learners"☆31May 31, 2023Updated 2 years ago
- Speech command recognition with capsule network & various NNs / KWS on Google Speech Command Dataset.☆25Jan 28, 2019Updated 7 years ago
- ☆60Jul 2, 2024Updated last year
- This SDK allows web-based apps/pages to interact with dictation devices☆16Feb 6, 2026Updated last week
- Who calls the shots? Rethinking Few-Shot Learning for Audio (WASPAA 2021)☆43May 24, 2022Updated 3 years ago
- A list of resources that can help in research for automated audio captioning☆34Feb 17, 2021Updated 4 years ago
- This repo hosts the code and model of "Separate What You Describe: Language-Queried Audio Source Separation", Interspeech 2022☆145Oct 11, 2023Updated 2 years ago
- Supporting code for instrumentation courses at Universidade Nova de Lisboa - Faculdade de Ciência de Lisboa☆16Oct 7, 2022Updated 3 years ago
- Speech Emotion Recognition using Deep Learning☆12May 24, 2021Updated 4 years ago
- A lightweight library to read/write wave audio files to/from lists of native Python types.☆12Jun 10, 2024Updated last year
- BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation☆227Apr 26, 2023Updated 2 years ago
- Implementation of the paper, T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis, ac…☆34May 25, 2024Updated last year
- ☆10Mar 5, 2024Updated last year
- Simulation of some common communication system structures☆11Feb 27, 2023Updated 2 years ago
- ☆15Apr 4, 2023Updated 2 years ago
- Debiasing Through Data Attribution☆12May 23, 2024Updated last year
- ☆12Jun 17, 2019Updated 6 years ago
- The projects and materials that accompany the Face Tracking with RealityKit course☆13Feb 2, 2021Updated 5 years ago
- Tool for slot extraction from text☆15Oct 23, 2022Updated 3 years ago
- Interface for Controllable Expressive Talking Machine☆40Sep 20, 2025Updated 4 months ago
- Official repo for "A MODULATION-DOMAIN LOSS FOR NEURAL-NETWORK-BASED REAL-TIME SPEECH ENHANCEMENT" to appear in ICASSP 2021☆44Oct 14, 2021Updated 4 years ago
- A Diffusion Probabilistic Model for Target Sound Extraction☆40Sep 27, 2024Updated last year
- Visually-Aware Audio Captioning☆43Mar 3, 2023Updated 2 years ago
- Code for "CL4AC: A Contrastive Loss for Audio Captioning", DCASE Workshop 2021.☆45Oct 8, 2021Updated 4 years ago
- JamendoMaxCaps is a large-scale dataset of 362,000 instrumental Creative Commons tracks☆46May 24, 2025Updated 8 months ago
- Official repository: Environmental Sound Classification on the Edge: A Pipeline for Deep Acoustic Networks on Extremely Resource-Constrain…☆43Jul 19, 2023Updated 2 years ago
- Web-crawl for "Audio Retrieval with WavText5K and CLAP Training"☆50Nov 10, 2022Updated 3 years ago
- Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using ß-VAE"☆44Apr 10, 2023Updated 2 years ago