oncescuandreea/audio-retrieval

Implementation of "Audio Retrieval with Natural Language Queries", INTERSPEECH 2021, PyTorch

☆26

Alternatives and similar repositories for audio-retrieval

Users who are interested in audio-retrieval are comparing it to the repositories listed below.

akoepke / audio-retrieval-benchmark
Code for "Audio Retrieval with Natural Language Queries: A Benchmark Study", Transactions on Multimedia 2022
☆54 · Jul 16, 2025 · Updated last year

microsoft / WavText5K
Web-crawl for "Audio Retrieval with WavText5K and CLAP Training"
☆50 · Nov 10, 2022 · Updated 3 years ago

oncescuandreea / QuerYD_downloader
☆23 · Dec 5, 2023 · Updated 2 years ago

mugen-org / MUGEN_baseline
multimodal video-audio-text generation and retrieval between every pair of modalities on the MUGEN dataset. The repo. contains the traini…
☆42 · Apr 1, 2023 · Updated 3 years ago

Infinity-INF / fast-phasr
Phonemes and durations labeling based on whisper small
☆11 · Jul 7, 2024 · Updated 2 years ago

anton-kashkin / hifi_vc
☆25 · Jan 24, 2023 · Updated 3 years ago

filtir / awesome-AI-fact-checking
A collection of papers tackling automatic fact-checking (particularly of AI-generated content)
☆13 · Nov 3, 2023 · Updated 2 years ago

ExplainableML / ImageFreeZSL
☆18 · Oct 5, 2024 · Updated last year

Bizilizi / VGGSounder
VGGSounder, a multi-label audio-visual classification dataset with modality annotations.
☆17 · Jun 30, 2026 · Updated 3 weeks ago

Healbadbad / curveball-pytorch
An implementation of "Small steps and giant leaps: Minimal Newton solvers for Deep Learning" in PyTorch
☆21 · Jul 16, 2018 · Updated 8 years ago

ExplainableML / TCAF-GZSL
This repository contains the code for our ECCV 2022 paper "Temporal and cross-modal attention for audio-visual zero-shot learning"
☆25 · Sep 12, 2025 · Updated 10 months ago

basilevh / dissecting-image-crops
When can you tell whether an image has been cropped or not?
☆29 · Sep 19, 2021 · Updated 4 years ago

karchkha / MelSpec_GPT_VQVAE
Audio Generation model working with GPT-2 and VQVAE compressed representation of MelSpectrograms
☆18 · Oct 8, 2023 · Updated 2 years ago

Top34051 / stargan-zsvc
Unofficial PyTorch Implementation of StarGAN-ZSVC
☆14 · Aug 5, 2021 · Updated 4 years ago

CODEJIN / XiaoiceSing2
☆19 · Feb 2, 2023 · Updated 3 years ago

audio-captioning / audio-captioning-resources
A list of resources that can help in research for automated audio captioning
☆34 · Feb 17, 2021 · Updated 5 years ago

jonathan-roberts1 / SciFIBench
NeurIPS 2024: SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation
☆13 · May 24, 2025 · Updated last year

EricLee8 / BiDeN
The official code of our paper at EMNLP 2022: Back to the Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Mo…
☆16 · Feb 17, 2023 · Updated 3 years ago

WangHelin1997 / Aty-TTS
Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech
☆11 · May 14, 2025 · Updated last year

JiaRenChang / Batch_Normalized_Maxout_NIN
http://arxiv.org/abs/1511.02583
☆11 · Nov 5, 2017 · Updated 8 years ago

BayesWatch / pytorch-blockswap
Code for BlockSwap (ICLR 2020).
☆33 · Mar 25, 2021 · Updated 5 years ago

tqbl / ood_audio
An audio classification system for learning with out-of-distribution data
☆33 · Dec 8, 2022 · Updated 3 years ago

ilaria-manco / mulap
Official implementation of "Learning Music Audio Representations Via Weak Language Supervision" (ICASSP 2022)
☆47 · Dec 3, 2024 · Updated last year

XinhaoMei / WavCaps
This repository contains metadata of the WavCaps dataset and code for downstream tasks.
☆264 · Jul 25, 2024 · Updated 2 years ago

willprice / play-fair
Shapley values for assessing the importance of each frame in a video
☆17 · Mar 1, 2021 · Updated 5 years ago

rrkarim / unbounded-cache-lm
Unbounded cache model for online language modeling with open vocabulary
☆11 · Feb 15, 2019 · Updated 7 years ago

liuxubo717 / cl4ac
Code for "CL4AC: A Contrastive Loss for Audio Captioning", DCASE Workshop 2021.
☆45 · Oct 8, 2021 · Updated 4 years ago

ldzhangyx / BART-fusion
The code repository for our paper "Interpreting Song Lyrics with a Music-Informed Pre-trained Language Model".
☆24 · Dec 12, 2022 · Updated 3 years ago

audio-captioning / dcase-2020-baseline
Audio captioning baseline system for DCASE 2020 challenge.
☆38 · Aug 22, 2023 · Updated 2 years ago

EricLee8 / SPACE
The official code for our paper at COLING 2022: Semantic-Preserving Adversarial Code Comprehension
☆12 · Oct 23, 2022 · Updated 3 years ago

RicherMans / AudioCaption
Dataset and baseline for the first Audiocaption task
☆79 · Jul 25, 2024 · Updated 2 years ago

ilaria-manco / muscall
Official implementation of "Contrastive Audio-Language Learning for Music" (ISMIR 2022)
☆122 · Dec 5, 2024 · Updated last year

grtzsohalf / SpeechNet-codebase
☆21 · Jun 1, 2021 · Updated 5 years ago

hendriks73 / directional_cnns
Source code repository for the SMC paper "Musical Tempo and Key Estimation using Convolutional Neural Networks with Directional Filters".
☆33 · Mar 24, 2023 · Updated 3 years ago

YapengTian / AVVP-ECCV20
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing, ECCV, 2020. (Spotlight)
☆90 · Jul 25, 2024 · Updated 2 years ago

hertz-pj / dinglingling
dinglingling, your program over!
☆18 · Mar 27, 2020 · Updated 6 years ago

avi33 / universalmelgan
This is an unofficial implementation of universal melgan according to https://arxiv.org/abs/2011.09631
☆23 · Aug 15, 2022 · Updated 3 years ago

RicherMans / HEAR2021_EfficientLatent
Submission to the HEAR2021 Challenge
☆17 · Mar 5, 2022 · Updated 4 years ago

qbxlvnf11 / MultiWOZ2.1-parser
MultiWOZ2.1-Parser for Dialogue State Tracking
☆13 · Aug 3, 2021 · Updated 4 years ago