[ICLR2025] Code Release of Refining CLIP's Spatial Awareness: A Visual-centric Perspective
☆20Apr 11, 2025Updated 11 months ago
Alternatives and similar repositories for CLIPRefiner
Users who are interested in CLIPRefiner are comparing it to the libraries listed below.
- Code for CVPR2025 paper: Generating Multimodal Driving Scenes via Next-Scene Prediction☆104Nov 7, 2025Updated 4 months ago
- Official implementation of the work "Coherent and Multi-modality Image Inpainting via Latent Space Optimization"☆55Apr 10, 2025Updated 11 months ago
- Will Pre-Training Ever End? A First Step Toward Next-Generation Foundation MLLMs via Self-Improving Systematic Cognition☆31May 14, 2025Updated 10 months ago
- Code for Point-Level Region Contrast (https://arxiv.org/abs/2202.04639)☆35Dec 23, 2022Updated 3 years ago
- PyTorch implementation of Supercombo, an end-to-end model for Level 2 autonomous driving on a single device (OpenPilot)☆13Jun 27, 2022Updated 3 years ago
- ☆11Apr 28, 2024Updated last year
- Code of the paper "Unseen from Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation"…☆17Nov 11, 2025Updated 4 months ago
- The project presents a drone obstacle avoidance system using Microsoft AirSim and the DDPG algorithm, training drones with LIDAR and dept…☆21May 22, 2024Updated last year
- ☆21May 21, 2020Updated 5 years ago
- A curated publication list on visual dialog☆14May 8, 2023Updated 2 years ago
- ☆32Jun 1, 2023Updated 2 years ago
- ☆24Feb 23, 2026Updated last month
- The goal of this project is to make a prediction model which will predict whether an athlete will win a medal or not.☆10Sep 17, 2021Updated 4 years ago
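- Gomoku with forbidden moves (final project for Prof. Yang Lixiang's elective course "C++ Programming" at the University of Chinese Academy of Sciences)☆18Feb 24, 2022Updated 4 years ago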
- [ICCV 2025] Unbiased Region-Language Alignment for Open-Vocabulary Dense Prediction☆52Sep 22, 2025Updated 6 months ago
- Code for "Grounded Vision-Language Navigation for UAVs with Open-Vocabulary Goal Understanding"☆58Mar 17, 2026Updated last week
- ☆28Jan 27, 2025Updated last year
- [CVPR 2026] Official implementation of "ACoT-VLA: Action Chain-of-Thought for Vision-Language-Action Models"☆71Feb 28, 2026Updated 3 weeks ago
- A torch-based implementation of K-Means and K-Means++☆17Dec 6, 2020Updated 5 years ago
- [ICRA 2026] Official implementation of the paper "InSpire: Vision-Language-Action Models with Intrinsic Spatial Reasoning"☆48Feb 2, 2026Updated last month
- Official implementation of "g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks" (CVPR'25).☆47Jul 14, 2025Updated 8 months ago
- ☆44Nov 13, 2025Updated 4 months ago
- ☆67May 5, 2025Updated 10 months ago
- EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models☆80Dec 17, 2025Updated 3 months ago
- Add Chroma architecture to forge☆41Jun 24, 2025Updated 9 months ago
- [CVPR 2022 Oral] Towards Open Set Temporal Action Localization☆54Sep 4, 2023Updated 2 years ago
- [CVPR 2023] Official repository for paper "Stare at What You See: Masked Image Modeling without Reconstruction"☆70Jul 2, 2025Updated 8 months ago
- PyTorch implementation of "UNIT: Unifying Image and Text Recognition in One Vision Encoder", NeurIPS 2024.☆34Sep 26, 2024Updated last year
- Review automated kernel generation in the era of LLMs☆148Feb 28, 2026Updated 3 weeks ago
- Official implementation of Sim-to-Real Transfer via 3D Feature Fields for Vision-and-Language Navigation (CoRL'24).☆76Dec 26, 2025Updated 2 months ago
- Hierarchical Universal Language Conditioned Policies☆77Mar 19, 2024Updated 2 years ago
- EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing [ICLR 2026]☆130Feb 6, 2026Updated last month
- Official implementation of ImageCritic (CVPR 2026)☆156Mar 7, 2026Updated 2 weeks ago
- [ECCV2024] Towards Reliable Advertising Image Generation Using Human Feedback☆59Nov 8, 2024Updated last year
- [CVPR'25] Towards More General Video-based Deepfake Detection through Facial Feature Guided Adaptation for Foundation Model (DFD-FCG)☆49Jul 20, 2025Updated 8 months ago
- Official codebase for Margin-aware Preference Optimization for Aligning Diffusion Models without Reference (MaPO).☆82Jun 11, 2024Updated last year
- Official codes for paper: Localizing Anomalies from Weakly-Labeled Videos☆85Sep 23, 2022Updated 3 years ago
- Official Code of "Distribution Matching Distillation Meets Reinforcement Learning"☆208Feb 1, 2026Updated last month
- Official Code For VLA-OS.☆144Jun 25, 2025Updated 9 months ago