Code for Retrieval-Augmented Perception (ICML 2025)
☆69Apr 8, 2026Updated this week
Alternatives and similar repositories for RAP
Users that are interested in RAP are comparing it to the libraries listed below.
- PyTorch Implementation of "Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Larg…☆47Mar 2, 2026Updated last month
- Code for LLM_Catastrophic_Forgetting via SAM.☆11Jun 7, 2024Updated last year
- 🚀enhanced GRPO with more verifiable rewards and real-time evaluators☆37Jan 27, 2026Updated 2 months ago
- [EMNLP-2025 Oral] ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration☆80Nov 20, 2025Updated 4 months ago
- The official implementation of InfoRM [NeurIPS 2024].☆15Oct 25, 2025Updated 5 months ago
- The code for the paper "Dual Mutual Information Constraints for Discriminative Clustering"☆23Aug 22, 2024Updated last year
- Expression Snippet Transformer for Robust Video-based Facial Expression Recognition☆17Jan 27, 2024Updated 2 years ago
- Official repo for [NeurIPS 2025 Spotlight] "GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution"☆48Oct 27, 2025Updated 5 months ago
- Official repo for ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models☆28Mar 24, 2025Updated last year
- A vision-language model with bidirectional progressive fusion and global-local alignment for enhanced medical image segmentation.☆17Dec 25, 2025Updated 3 months ago
- Official code base for "Long-Tailed Diffusion Models With Oriented Calibration" ICLR2024☆17Jul 11, 2024Updated last year
- [CVPR2025] Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models☆20Apr 30, 2025Updated 11 months ago
- [MICCAI 2025] Bridging the Gap in Missing Modalities: Leveraging Knowledge Distillation and Style Matching for Brain Tumor Segmentation☆19Jul 13, 2025Updated 8 months ago
- [CVPR 2024] Official Code for the Paper "Compositional Chain-of-Thought Prompting for Large Multimodal Models"☆146Jun 20, 2024Updated last year
- ☆28Feb 10, 2025Updated last year
- Code for WisdoM: Improving Multimodal Sentiment Analysis by Fusing Contextual World Knowledge☆17Dec 31, 2024Updated last year
- ☆24Jun 18, 2025Updated 9 months ago
- Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning☆44Jul 2, 2025Updated 9 months ago
- Towards Robust Multimodal Sentiment Analysis with Incomplete Data☆111Feb 24, 2026Updated last month
- Official code of paper "GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis" [ICCV 2025]☆43Jun 29, 2025Updated 9 months ago
- Official Implementation of "IRBridge: Solving Image Restoration Bridge with Pre-trained Generative Diffusion Models"☆17Jun 5, 2025Updated 10 months ago
- [ICML 2024] Official Implementation of Enhancing Cross-Modal Fine-Tuning with Gradually Intermediate Modality Generation☆13Jul 13, 2024Updated last year
- ☆23Nov 29, 2024Updated last year
- Official implementation of EgoThinker at NIPS 2025☆25Nov 25, 2025Updated 4 months ago
- Evaluation of ML models in Android malware classification, adversarial attacks on DNNs & defense mechanisms☆13Jan 14, 2020Updated 6 years ago
- [ICML 2024] Code for the paper "Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases"☆38Jul 12, 2024Updated last year
- This is the official code of "Uncovering Prototypical Knowledge for Weakly Open-Vocabulary Semantic Segmentation, NeurIPS 23"☆26Dec 7, 2023Updated 2 years ago
- [ICML 2025 Oral] This is the official repository of the paper "What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensi…☆22Jun 12, 2025Updated 10 months ago
- [CVPR 2025] Official Pytorch implementation of "Learning with Noisy Triplet Correspondence for Composed Image Retrieval".☆24Jun 9, 2025Updated 10 months ago
- Pytorch implementation for codes in Noise Imitation Based Adversarial Training for Robust Multimodal Sentiment Analysis (Accepted by IEEE…☆14Feb 2, 2024Updated 2 years ago
- [CVPR 2026] ZoomEarth: Active Perception for Ultra-High-Resolution Geospatial Vision-Language Tasks☆34Updated this week
- Working note for WSI analysis☆10Apr 3, 2023Updated 3 years ago
- The official code of "CaLa: Complementary Association Learning for Augmenting Composed Image Retrieval"☆15Sep 19, 2024Updated last year
- [TPAMI 2026] Breaking Barriers, Localizing Saliency: A Large-scale Benchmark and Baseline for Condition-Constrained Salient Object Detect…☆27Dec 12, 2025Updated 4 months ago
- The official implementation of DDS2M [ICCV 2023].☆123Jul 15, 2024Updated last year
- Monte Carlo Tree Search Self-Refine (MCTSr)☆22Jul 6, 2024Updated last year
- visual-language reasoning segmentation of function-level building footprint☆19May 17, 2025Updated 10 months ago
- Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval --ICCV2023 Oral☆91Nov 2, 2023Updated 2 years ago
- VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model☆15Jul 31, 2025Updated 8 months ago