SAGNIKMJR / few-shot-rir
Code and datasets for 'Few-Shot Audio-Visual Learning of Environment Acoustics' (NeurIPS 2022)
☆14 · Updated last year
Alternatives and similar repositories for few-shot-rir:
Users who are interested in few-shot-rir are comparing it to the libraries listed below:
- Repo for Visual Acoustic Matching, CVPR 2022 ☆66 · Updated last year
- Code for paper Learning Audio-Visual Dereverberation ☆26 · Updated 2 years ago
- Encoder of BAT (Learning to Reason about Spatial Sounds with Large Language Models) ☆40 · Updated last week
- Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark ☆41 · Updated 5 months ago
- An official repo for the paper "Adapting Language-Audio Models as Few-Shot Audio Learners" ☆30 · Updated last year
- Project website for "Telling left from right: Learning spatial correspondence between sight and sound" ☆22 · Updated 2 years ago
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo… ☆19 · Updated last year
- ☆37 · Updated 2 years ago
- Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation ☆36 · Updated last year
- Repository of the WACV'24 paper "Can CLIP Help Sound Source Localization?" ☆15 · Updated 10 months ago
- SRTNet ☆24 · Updated last year
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv… ☆36 · Updated last year
- [Neurips'24 Spotlight] Official code for "Acoustic Volume Rendering for Neural Impulse Response Fields" ☆28 · Updated last month
- Official Implementation of "Inference and Denoise: Causal Inference-based Neural Speech Enhancement" ☆27 · Updated last year
- PyTorch implementation of "Integrated Parameter-Efficient Tuning for General-Purpose Audio Models" ☆10 · Updated last year
- Source code for the paper 'Audio Captioning Transformer' ☆53 · Updated 3 years ago
- Official Codebase of "A Closer Look at Weakly-Supervised Audio-Visual Source Localization" (NeurIPS 2022) ☆16 · Updated 2 years ago
- ☆9 · Updated 8 months ago
- Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer ☆86 · Updated 2 years ago
- Implementation of the paper, T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis, ac… ☆28 · Updated 8 months ago
- PyTorch implementation for “V2C: Visual Voice Cloning” ☆30 · Updated 2 years ago
- Audio propagation engine - Meta Reality Labs Research. ☆18 · Updated 2 years ago
- Baseline for DCASE 2024 Task 9: "Language-Queried Audio Source Separation" ☆24 · Updated 10 months ago
- ☆80 · Updated last year
- Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023) ☆9 · Updated last year
- Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models ☆14 · Updated 7 months ago
- COLA contrastive pre-training method implemented in PyTorch ☆42 · Updated 4 years ago
- A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models (ICASSP 2024) ☆51 · Updated 10 months ago
- Understanding and Tackling Hallucinations in Large Audio-Language Models | ICASSP 2025, Interspeech 2024 ☆19 · Updated last month
- Source code and demo for INTERSPEECH 2023 paper: DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion P… ☆35 · Updated last year