☆10Apr 12, 2023Updated 2 years ago
Alternatives and similar repositories for AFFIA3K
Users that are interested in AFFIA3K are comparing it to the libraries listed below
- Code and data recipes for the paper: Heterogeneous Target Speech Separation☆43Dec 6, 2022Updated 3 years ago
- Web-crawl for "Audio Retrieval with WavText5K and CLAP Training"☆50Nov 10, 2022Updated 3 years ago
- Learning differentiable temporal resolution on time-series data.☆37Nov 12, 2022Updated 3 years ago
- The audio-visual fusion method for FFIA☆29Aug 5, 2024Updated last year
- ☆26Oct 13, 2023Updated 2 years ago
- Unofficial implementation of the paper "Objects that Sound", from ECCV 2018.☆31Jan 29, 2024Updated 2 years ago
- Official Pytorch Implementation for Continual Learning For On-Device Environmental Sound Classification☆14Jul 19, 2022Updated 3 years ago
- ☆19Sep 2, 2022Updated 3 years ago
- Code and generated sounds for "Conditional Sound Generation Using Neural Discrete Time-Frequency Representation Learning", MLSP 2021☆69Sep 3, 2021Updated 4 years ago
- Code for "Simple Pooling Front-ends for Efficient Audio Classification", ICASSP 2023☆57Mar 3, 2023Updated 3 years ago
- Pytorch implementation of subband decomposition☆92Jul 26, 2022Updated 3 years ago
- Audio Captioning datasets for PyTorch.☆127Jul 18, 2025Updated 8 months ago
- Visually-Aware Audio Captioning☆43Mar 3, 2023Updated 3 years ago
- Implementation of our paper 'On Metric Learning For Audio-Text Cross-Modal Retrieval'☆51May 17, 2022Updated 3 years ago
- Implementation of "Audio Retrieval with Natural Language Queries: A Benchmark Study".☆54Jul 16, 2025Updated 8 months ago
- Code for "CL4AC: A Contrastive Loss for Audio Captioning", DCASE Workshop 2021.☆45Oct 8, 2021Updated 4 years ago
- This repo hosts the code and model of "Separate What You Describe: Language-Queried Audio Source Separation", Interspeech 2022☆145Oct 11, 2023Updated 2 years ago
- ☆21Jul 11, 2019Updated 6 years ago
- General purpose sound recognition demo☆161Oct 3, 2023Updated 2 years ago
- Contrastive Language-Audio Pretraining☆87Mar 6, 2022Updated 4 years ago
- The dataset and baseline code for Text-to-Audio Grounding (TAG)☆50Oct 23, 2025Updated 4 months ago
- TF/Keras code for DiffStride, a pooling layer with learnable strides.☆124Feb 7, 2022Updated 4 years ago
- VGGSound: A Large-scale Audio-Visual Dataset☆354Sep 13, 2021Updated 4 years ago
- This toolbox aims to unify audio generation model evaluation for easier comparison.☆378Sep 29, 2024Updated last year
- An STFT/iSTFT for PyTorch.☆369Oct 31, 2023Updated 2 years ago
- An Arduino library for interfacing with an ACAM TDC-GP22 over SPI (For Arduino Due)☆17Jun 15, 2020Updated 5 years ago
- ☆15Oct 3, 2023Updated 2 years ago
- Official implementation for CIGN☆17Sep 11, 2023Updated 2 years ago
- Datasets of fish for deep learning.☆20Aug 15, 2024Updated last year
- This task is based on the MUSIC-AVQA dataset. We focus on optimizing the accuracy of the AVQA task, which aims to answer questions regarding di…☆13Feb 11, 2023Updated 3 years ago
- ☆509Jun 25, 2024Updated last year
- A PyTorch implementation of the paper: "AMSS-Net: Audio Manipulation on User-Specified Sources with Textual Queries" (ACM Multimedia 2021…☆21Jul 4, 2021Updated 4 years ago
- Optimizing Efficiency and Visual-Textual Alignment for LLM-Based Radiology Report Generation☆19Mar 5, 2025Updated last year
- Analyzing and Enhancing Visual Learning in LLM-based Radiology Report Generation☆17Feb 23, 2026Updated 3 weeks ago
- WildDESED: A LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection☆17Nov 19, 2024Updated last year
- This repo hosts the code and models of "Masked Autoencoders that Listen".☆653Apr 5, 2024Updated last year
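- Student grade management system, a data structures course project written in sophomore year and implemented with a singly linked list. The encoding is GB2312. Features implemented so far: initialization, insertion, modification, deletion, sorting, displaying the make-up exam list, displaying outstanding students, output, and destruction. File support may be added later, along with flowcharts.☆15Dec 5, 2020Updated 5 years ago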
- Enhancing Radiology Report Generation via Multi-Phased Supervision☆24Mar 6, 2025Updated last year
- Object detection of fish and human faces based on yolo3☆16Sep 10, 2019Updated 6 years ago