JiabenChen / iQuery
[CVPR 2023] iQuery: Instruments as Queries for Audio-Visual Sound Separation
☆61 · Updated last year
Related projects:
- [NeurIPS 2023] AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis ☆19 · Updated 7 months ago
- A collection of 3D vision and language (e.g., 3D Visual Grounding, 3D Question Answering and 3D Dense Caption) papers and datasets. ☆95 · Updated last year
- Official Implementation of paper "Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence" ☆81 · Updated last week
- The official instructions of HOI4D dataset. ☆48 · Updated last year
- Code release for PianoMotion10M ☆50 · Updated last month
- [ECCV2022] D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding ☆38 · Updated 2 years ago
- [ECCV 2024] Code for Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation ☆24 · Updated 2 months ago
- For Ego4D VQ3D Task ☆16 · Updated 8 months ago
- Official implementation of Language Conditioned Spatial Relation Reasoning for 3D Object Grounding (NeurIPS'22). ☆50 · Updated last year
- (ICCV2023) IST-Net: Prior-free Category-level Pose Estimation with Implicit Space Transformation ☆107 · Updated 9 months ago
- Code and data release for the paper "Learning Object State Changes in Videos: An Open-World Perspective" (CVPR 2024) ☆27 · Updated last week
- [ICML 2023] Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining ☆124 · Updated last month
- Code for "Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes" ☆50 · Updated 5 months ago
- [NeurIPS2023] Implementation of the paper: Explore In-Context Learning for 3D Point Cloud Understanding ☆62 · Updated 5 months ago
- [ICLR 2023] Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning? ☆97 · Updated 2 months ago
- Official implementation of the CVPR'24 paper [Adaptive Slot Attention: Object Discovery with Dynamic Slot Number] ☆19 · Updated 3 weeks ago
- Official implementation of the paper "Unifying 3D Vision-Language Understanding via Promptable Queries" ☆38 · Updated last month
- SceneFun3D ToolKit ☆58 · Updated 4 months ago
- Bidirectional Mapping between Action Physical-Semantic Space ☆25 · Updated 2 weeks ago
- [ICCV2021] 3DVG-Transformer: Relation Modeling for Visual Grounding on Point Clouds ☆38 · Updated 2 years ago
- [NeurIPS 2023] Weakly Supervised 3D Open-vocabulary Segmentation ☆102 · Updated 8 months ago
- [CVPR2022 Oral] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds ☆51 · Updated last year
- Independent PyTorch Implementation of Object Scene Representation Transformer ☆44 · Updated last year