EnVision-Research / A4-Agent
A4-Agent: An Agentic Framework for Zero-Shot Affordance Reasoning
☆28Updated last month
Alternatives and similar repositories for A4-Agent
Users that are interested in A4-Agent are comparing it to the libraries listed below
- [ICCV 2025] AnyI2V: Animating Any Conditional Image with Motion Control Generation☆119Updated 5 months ago
- Code of BRIDGE: Building Reinforcement-Learning Depth-to-Image Data Generation Engine for Monocular Depth Estimation☆116Updated 3 months ago
- [ICCV 2025] MOVE: Motion-Guided Few-Shot Video Object Segmentation☆87Updated 4 months ago
- [ICCV 2025] Free-Form Motion Control: Controlling the 6D Poses of Camera and Objects in Video Generation☆55Updated 5 months ago
- ☆53Updated 2 months ago
- [ICCV 2025] Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation☆82Updated 3 months ago
- ☆70Updated 6 months ago
- Official implementation of "Robo-Dopamine: General Process Reward Modeling for High-Precision Robotic Manipulation"☆154Updated this week
- [NeurIPS 2025] Styl3R: Instant 3D Stylized Reconstruction for Arbitrary Scenes and Styles☆100Updated 2 months ago
- ☆39Updated 2 months ago
- [CVPR-2024] Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation☆86Updated last year
- Official repo of "Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens"☆269Updated 2 weeks ago
- Local nonlinear causal attention latent diffusion models for visual story synthesizing☆29Updated 9 months ago
- ☆76Updated 3 months ago
- ☆26Updated 2 months ago
- RealSee3D: A multi-view RGB-D dataset combining real-world captures and procedurally generated scenes, with extensible annotations for di…☆226Updated last month
- A Survey of Image Editing☆463Updated 5 months ago
- A tool and viewer for placing and reshaping 3D models.☆14Updated 2 years ago
- [NeurIPS 2025 spotlight] QFFT, Question-Free Fine-Tuning for Adaptive Reasoning☆91Updated 2 months ago
- [ACM MM-2024] RefMask3D: Language-Guided Transformer for 3D Referring Segmentation☆66Updated last year
- Multimodal Referring Segmentation☆201Updated last month
- Official implementation of [OmniNav: A Unified Framework for Prospective Exploration and Visual-Language Navigation]☆63Updated 3 weeks ago
- A benchmark dataset for GREx: GRES, GREC, and GREG [CVPR 2023 & IJCV 2026]☆239Updated 2 months ago
- [ICCV 2023] MOSE: A New Dataset for Video Object Segmentation in Complex Scenes☆361Updated 4 months ago
- ☆109Updated last week
- [AAAI 2025] MultiBooth: This repo is the official implementation of "MultiBooth: Towards Generating All Your Concepts in an Image from Te…☆118Updated last year
- An open-source agentic framework that enables AI to use computers like humans and can provide a multi-agent runtime environment as an inf…☆114Updated last week
- Official repo for [NeurIPS 2025] "DGSolver: Diffusion Generalist Solver with Universal Posterior Sampling for Image Restoration"☆139Updated 8 months ago
- [CVPR-2023] Semantic-Promoted Debiasing and Background Disambiguation for Zero-Shot Instance Segmentation☆18Updated 2 years ago
- ☆117Updated 2 months ago