Code for Retrieval-Augmented Perception (ICML 2025)
☆69Apr 22, 2026Updated 2 weeks ago
Alternatives and similar repositories for RAP
Users that are interested in RAP are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- PyTorch Implementation of "Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Larg…☆48Mar 2, 2026Updated 2 months ago
- Code for LLM_Catastrophic_Forgetting via SAM.☆11Jun 7, 2024Updated last year
- 🚀enhanced GRPO with more verifiable rewards and real-time evaluators☆37Jan 27, 2026Updated 3 months ago
- [EMNLP-2025 Oral] ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration☆80Nov 20, 2025Updated 5 months ago
- The official implementation of InfoRM [NeurIPS 2024].☆15Oct 25, 2025Updated 6 months ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- The code for the paper "Dual Mutual Information Constraints for Discriminative Clustering"☆23Aug 22, 2024Updated last year
- Official repo for [NeurlPS 2025 Spotlight] "GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution"☆48Oct 27, 2025Updated 6 months ago
- Official repo for ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models☆28Mar 24, 2025Updated last year
- a fully open-source implementation of a GPT-4o-like speech-to-speech video understanding model.☆38Apr 7, 2025Updated last year
- Official code base for "Long-Tailed Diffusion Models With Oriented Calibration" ICLR2024☆17Jul 11, 2024Updated last year
- ☆47Apr 13, 2026Updated 3 weeks ago
- ☆22Sep 23, 2025Updated 7 months ago
- ☆13Apr 23, 2025Updated last year
- (TGRS 2024) OrientedFormer: An End-to-End Transformer-Based Oriented Object Detector in Remote Sensing Images☆48Jul 14, 2025Updated 9 months ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- The official implementation for Candidate Set Re-ranking for Composed Image Retrieval (TMLR) 01/2024☆20Feb 7, 2024Updated 2 years ago
- [CVPR 2024] Official Code for the Paper "Compositional Chain-of-Thought Prompting for Large Multimodal Models"☆144Jun 20, 2024Updated last year
- [ISPRS 2024] LoveNAS: Towards Multi-Scene Land-Cover Mapping via Hierarchical Searching Adaptive Network☆33Dec 1, 2024Updated last year
- ☆29Feb 10, 2025Updated last year
- Code for WisdoM: Improving Multimodal Sentiment Analysis by Fusing Contextual World Knowledge☆17Dec 31, 2024Updated last year
- Creating High-Fidelity Synthetic GPS Trajectory Dataset for Urban Mobility Analysis☆22Mar 12, 2026Updated last month
- ☆24Jun 18, 2025Updated 10 months ago
- Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning☆44Jul 2, 2025Updated 10 months ago
- [ICML-2025] Official Pytorch implementation of "Perceptual-GS: Scene-adaptive Perceptual Densification for Gaussian Splatting"☆26Mar 31, 2026Updated last month
- Open source password manager - Proton Pass • AdSecurely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
- Official code of paper "GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis" [ICCV 2025]☆45Jun 29, 2025Updated 10 months ago
- [ACM TOMM] Official implementation of "TextCoT: Zoom-In for Enhanced Multimodal Text-Rich Image Understanding"☆45Feb 27, 2026Updated 2 months ago
- [ICML 2024] Official Implementation of Enhancing Cross-Modal Fine-Tuning with Gradually Intermediate Modality Generation☆13Jul 13, 2024Updated last year
- [ICLR'25] Official code for the paper 'MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs'☆368Apr 20, 2025Updated last year
- ☆24Nov 29, 2024Updated last year
- Official implementation of EgoThinker at NIPS 2025☆26Nov 25, 2025Updated 5 months ago
- ☆13Nov 2, 2025Updated 6 months ago
- [ICML 2024] Code for the paper "Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases"☆38Jul 12, 2024Updated last year
- This is the official code of "Uncovering Prototypical Knowledge for Weakly Open-Vocabulary Semantic Segmentation, NeurIPS 23"☆26Dec 7, 2023Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- [CVPR'25] Official code of paper "Mimic In-Context Learning for Multimodal Tasks"☆25Mar 10, 2026Updated last month
- [ICML 2025 Oral] This is the official repository of the paper "What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensi…☆22Jun 12, 2025Updated 10 months ago
- SkillRouter: Retrieve-and-Rerank Skill Selection for LLM Agents at Scale☆63Apr 28, 2026Updated last week
- [CVPR 2025] Official Pytorch implementation of "Learning with Noisy Triplet Correspondence for Composed Image Retrieval".☆24Jun 9, 2025Updated 10 months ago
- [CVPR 2026] ZoomEarth: Active Perception for Ultra-High-Resolution Geospatial Vision-Language Tasks☆36Apr 9, 2026Updated 3 weeks ago
- The official code of "CaLa: Complementary Association Learning for Augmenting Composed Image Retrieval"☆15Sep 19, 2024Updated last year
- Research work aimed at addressing the problem of modeling infinite-length context☆48Dec 18, 2025Updated 4 months ago