neu-vi / FleVRS
FleVRS: Towards Flexible Visual Relationship Segmentation, NeurIPS 2024
☆19 Updated last month
Alternatives and similar repositories for FleVRS:
Users who are interested in FleVRS are comparing it to the repositories listed below:
- Official implementation of PARIS3D (Accepted to ECCV 2024). ☆21 Updated 3 months ago
- Code for our paper: Learning Camera Movement Control from Real-World Drone Videos ☆21 Updated 3 weeks ago
- [ICLR 2024] Official implementation of the paper "Toss: High-quality text-guided novel view synthesis from a single image" ☆20 Updated 8 months ago
- The code for paper "Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding". ☆41 Updated 2 weeks ago
- Official code for MotionBench ☆22 Updated last week
- Can 3D Vision-Language Models Truly Understand Natural Language? ☆21 Updated 9 months ago
- [ECCV 2024] M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts. ☆58 Updated 3 months ago
- ☆38 Updated last year
- [NeurIPS 2024 D&B Track] Official Repo for "LVD-2M: A Long-take Video Dataset with Temporally Dense Captions" ☆45 Updated 3 months ago
- [3DV 2025] Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model ☆51 Updated 7 months ago
- EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation ☆90 Updated 2 months ago
- ☆24 Updated last year
- [ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation … ☆35 Updated 2 months ago
- 🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)" ☆31 Updated 7 months ago
- Code for paper "Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning" ☆23 Updated last year
- [CVPR2022 Oral] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds ☆52 Updated last year
- Code release for "SegLLM: Multi-round Reasoning Segmentation" ☆56 Updated last week
- Diffusion Powers Video Tokenizer for Comprehension and Generation ☆40 Updated last month
- ☆20 Updated last week
- Official Code for the NeurIPS'23 paper "3D-Aware Visual Question Answering about Parts, Poses and Occlusions" ☆14 Updated 3 months ago
- ☆58 Updated last year
- IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks ☆59 Updated 3 months ago
- [ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities ☆64 Updated 3 months ago
- ☆57 Updated last year
- [NeurIPS2023] Implementation of the paper: Explore In-Context Learning for 3D Point Cloud Understanding ☆66 Updated last month
- Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics ☆70 Updated 3 months ago
- (ICLR 2024, CVPR 2024) SparseFormer ☆67 Updated 2 months ago
- [CVPR 2024] The official implementation of paper "Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training" ☆32 Updated 8 months ago
- [NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos ☆95 Updated 3 weeks ago