JiabenChen / iQuery
[CVPR 2023] iQuery: Instruments as Queries for Audio-Visual Sound Separation
☆62 Updated last year
Alternatives and similar repositories for iQuery:
Users who are interested in iQuery are comparing it to the repositories listed below
- [NeurIPS 2023] AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis ☆26 Updated 11 months ago
- Bidirectional Mapping between Action Physical-Semantic Space ☆30 Updated 4 months ago
- [ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation … ☆35 Updated 2 months ago
- EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation ☆90 Updated 2 months ago
- Hearing Anything Anywhere Code Release ☆32 Updated 7 months ago
- A collection of 3D vision and language (e.g., 3D Visual Grounding, 3D Question Answering and 3D Dense Caption) papers and datasets. ☆96 Updated last year
- The official instructions of HOI4D dataset. ☆56 Updated last year
- Code and data release for the paper "Learning Object State Changes in Videos: An Open-World Perspective" (CVPR 2024) ☆32 Updated 4 months ago
- Official implementation for "Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches" (CVPR 2024) ☆20 Updated 6 months ago
- Code for "Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes" ☆49 Updated 9 months ago
- [ECCV2022] D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding ☆43 Updated 2 years ago
- SAAVN code release for the paper "Sound Adversarial Audio-Visual Navigation, ICLR2022" (in PyTorch) ☆18 Updated 2 years ago
- The official implementation of the work "AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward". ☆10 Updated last month
- [NeurIPS 2024] Official code repository for MSR3D paper ☆31 Updated 3 weeks ago
- Official Implementation of paper "Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence" ☆109 Updated 2 months ago
- Official implementation of the NeurIPS 2023 paper "Self-supervised Object-Centric Learning for Videos" ☆26 Updated last month
- Official implementation of Language Conditioned Spatial Relation Reasoning for 3D Object Grounding (NeurIPS'22). ☆57 Updated 2 years ago
- (ICCV2023) IST-Net: Prior-free Category-level Pose Estimation with Implicit Space Transformation ☆108 Updated last year
- [CVPR2022 Oral] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds ☆53 Updated last year
- Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation ☆36 Updated last year
- Code release for NeurIPS 2023 paper SlotDiffusion: Object-centric Learning with Diffusion Models ☆81 Updated last year
- Binding Touch to Everything: Learning Unified Multimodal Tactile Representations ☆26 Updated 10 months ago