[ICLR2025] Code Release of Refining CLIP's Spatial Awareness: A Visual-centric Perspective
☆21Apr 11, 2025Updated last year
Alternatives and similar repositories for CLIPRefiner
Users that are interested in CLIPRefiner are comparing it to the libraries listed below.
- Code for CVPR2025 paper: Generating Multimodal Driving Scenes via Next-Scene Prediction☆107Nov 7, 2025Updated 7 months ago
- Official implementation of the work "Coherent and Multi-modality Image Inpainting via Latent Space Optimization"☆55Apr 10, 2025Updated last year
- Code for Point-Level Region Contrast (https://arxiv.org/abs/2202.04639)☆35Dec 23, 2022Updated 3 years ago
- [NeurIPS'25] FlySearch: Exploring how vision-language models explore☆24Mar 12, 2026Updated 3 months ago
- Code of the paper "Unseen from Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation"…☆20Nov 11, 2025Updated 7 months ago
- Video-Language Continual Learning Benchmark☆20Oct 30, 2024Updated last year
- A curated publication list on visual dialog☆14May 8, 2023Updated 3 years ago
- ☆26Feb 23, 2026Updated 3 months ago
- The goal of this project is to make a prediction model which will predict whether an athlete will win a medal or not.☆10Sep 17, 2021Updated 4 years ago
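- Gomoku with forbidden moves (final project for Prof. Yang Lixiang's elective course "C++ Programming" at the University of Chinese Academy of Sciences)☆18Feb 24, 2022Updated 4 years ago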
- Code for MICCAI2023 paper: TransLiver: A Hybrid Transformer Model for Multi-phase Liver Lesion Classification☆18Jan 10, 2024Updated 2 years ago
- A minimalist (educational) implementation of Latent Diffusion Models (LDM) with PyTorch distributed training.☆13Dec 22, 2024Updated last year
- Fast-Slow Test-time Adaptation for Online Vision-and-Language Navigation☆35Dec 5, 2025Updated 6 months ago
- ☆100Aug 28, 2022Updated 3 years ago
- A torch-based implementation of K-Means and K-Means++☆17Dec 6, 2020Updated 5 years ago
- [ICRA 2026] Official implementation of the paper "InSpire: Vision-Language-Action Models with Intrinsic Spatial Reasoning"☆49Feb 2, 2026Updated 4 months ago
- ☆44Nov 13, 2025Updated 7 months ago
- ☆79May 5, 2025Updated last year
- Official implementation of "g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks" (CVPR'25).☆55Jul 14, 2025Updated 11 months ago
- EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models☆86Dec 17, 2025Updated 5 months ago
- [CVPR 2022 Oral] Towards Open Set Temporal Action Localization☆55Sep 4, 2023Updated 2 years ago
- PyTorch implementation of "UNIT: Unifying Image and Text Recognition in One Vision Encoder", NeurIPS 2024.☆34Sep 26, 2024Updated last year
- ☆53Jan 3, 2023Updated 3 years ago
- Official pytorch implementation of "AlphaFlow: Understanding and Improving MeanFlow Models"☆121Oct 24, 2025Updated 7 months ago
- Official implementation of Sim-to-Real Transfer via 3D Feature Fields for Vision-and-Language Navigation (CoRL'24).☆78Dec 26, 2025Updated 5 months ago
- Hierarchical Universal Language Conditioned Policies☆78Mar 19, 2024Updated 2 years ago
- Implementation of the proposed LVMAE, from the paper, Extending Video Masked Autoencoders to 128 frames, in Pytorch☆55Nov 25, 2024Updated last year
- EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing [ICLR 2026]☆147Apr 11, 2026Updated 2 months ago
- Mini-Kinetics-200 data splits used in paper "Rethinking Spatiotemporal Feature Learning For Video Understanding"☆80Dec 24, 2017Updated 8 years ago
- Official implementation of ImageCritic (CVPR 2026)☆163Jun 4, 2026Updated last week
- [ECCV2024] Towards Reliable Advertising Image Generation Using Human Feedback☆60Nov 8, 2024Updated last year
- Official codebase for Margin-aware Preference Optimization for Aligning Diffusion Models without Reference (MaPO).☆82Jun 11, 2024Updated 2 years ago
- Official codes for paper: Localizing Anomalies from Weakly-Labeled Videos☆86Sep 23, 2022Updated 3 years ago
- [CVPR'25] Towards More General Video-based Deepfake Detection through Facial Feature Guided Adaptation for Foundation Model (DFD-FCG)☆56Jul 20, 2025Updated 10 months ago
- This is the official repo of paper accepted in AAAI 2023 Oral.☆93Apr 19, 2023Updated 3 years ago
- ☆65Oct 11, 2023Updated 2 years ago
- PyTorch Implementation of DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning @ ECCV22☆126Feb 12, 2023Updated 3 years ago
- A curated list of awesome Vision-and-Language Navigation(VLN) resources (continually updated)☆115Mar 9, 2025Updated last year
- [ECCV 2022] Official PyTorch implementation of the paper "Zero-Shot Temporal Action Detection via Vision-Language Prompting"☆114Aug 3, 2023Updated 2 years ago