PyTorch Implementation of "Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Large Language Models"
☆40Mar 2, 2026Updated this week
Alternatives and similar repositories for HR-Bench
Users that are interested in HR-Bench are comparing it to the libraries listed below
- Code for Retrieval-Augmented Perception (ICML 2025)☆68Aug 10, 2025Updated 6 months ago
- [ICLR'25] Official code for the paper 'MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs'☆346Apr 20, 2025Updated 10 months ago
- [EMNLP-2025 Oral] ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration☆77Nov 20, 2025Updated 3 months ago
- Code for WisdoM: Improving Multimodal Sentiment Analysis by Fusing Contextual World Knowledge☆15Dec 31, 2024Updated last year
- [ACM TOMM] Official implementation of "TextCoT: Zoom-In for Enhanced Multimodal Text-Rich Image Understanding"☆44Feb 27, 2026Updated last week
- Expression Snippet Transformer for Robust Video-based Facial Expression Recognition☆17Jan 27, 2024Updated 2 years ago
- We introduce a new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…☆20Jan 11, 2026Updated last month
- Towards Safe LLM with our simple-yet-highly-effective Intention Analysis Prompting☆20Mar 25, 2024Updated last year
- Towards Robust Multimodal Sentiment Analysis with Incomplete Data☆105Feb 24, 2026Updated last week
- ☆46Feb 18, 2026Updated 2 weeks ago
- Code for ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding [ICML 2025]☆45Jul 22, 2025Updated 7 months ago
- PyTorch Implementation of "V* : Guided Visual Search as a Core Mechanism in Multimodal LLMs"☆691Jan 7, 2024Updated 2 years ago
- [ICLR2025] Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want☆94Dec 1, 2025Updated 3 months ago
- [ICLR 2026] The official implementation of the paper “Earth-Agent: Unlocking the Full Landscape of Earth Observation with Agents”☆97Feb 1, 2026Updated last month
- A bug-free and improved implementation of LLaVA-UHD, based on the code from the official repo☆34Aug 12, 2024Updated last year
- Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …☆91Nov 15, 2024Updated last year
- [MQM-APE] Toward High-Quality Error Annotation Predictors with Automatic Post-Editing in LLM Translation Evaluators.☆11Sep 24, 2024Updated last year
- A MATLAB class for convex optimization algorithms☆10Nov 19, 2023Updated 2 years ago
- Code for ASGEA: Exploiting Logic Rules from Align-Subgraphs for Entity Alignment☆11Feb 28, 2024Updated 2 years ago
- [PR 2024] Official PyTorch Code for "Dual Teachers for Self-Knowledge Distillation"☆13Nov 28, 2024Updated last year
- Gesture Recognition Based on ALTERA DE2-115 FPGA☆10Mar 18, 2014Updated 11 years ago
- [ECCV 2024] FlexAttention for Efficient High-Resolution Vision-Language Models☆46Jan 8, 2025Updated last year
- [WMT 2022 champion system] Vega-MT model and inference scripts☆41Feb 10, 2023Updated 3 years ago
- [ICCV 2025] Official implementation of LLaVA-KD: A Framework of Distilling Multimodal Large Language Models☆125Oct 14, 2025Updated 4 months ago
- VideoHallucer, the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)☆42Dec 16, 2025Updated 2 months ago
- [Neurips'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought …☆430Dec 22, 2024Updated last year
- Quick Long Video Understanding [TMLR2025]☆76Oct 27, 2025Updated 4 months ago
- ☆10May 16, 2024Updated last year
- ☆11Jul 30, 2025Updated 7 months ago
- ☆11Jan 19, 2025Updated last year
- The source code for “Homophily-Related: Adaptive Hybrid Graph Filter for Multi-View Graph Clustering”☆10Apr 10, 2024Updated last year
- [ICCV 2021] Multimodal Knowledge Expansion☆10Aug 28, 2021Updated 4 years ago
- Finetuning Stable Diffusion from Diffusers☆12Mar 11, 2024Updated last year
- The first attempt at Marine Open Vocabulary Instance Segmentation☆36Feb 24, 2026Updated last week
- A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based …☆11Mar 18, 2023Updated 2 years ago
- Automatic Review Generation Method based on Large Language Models☆18Jun 22, 2025Updated 8 months ago
- [ICLR 2025] RaSA: Rank-Sharing Low-Rank Adaptation☆10May 19, 2025Updated 9 months ago
- [TNNLS 2022] Official pytorch implementation of "Tackling the Challenges in Scene Graph Generation with Local-to-Global Interactions"☆11Apr 19, 2022Updated 3 years ago
- Code and data for EMNLP2019 Paper "Uncover the Ground-Truth Relations in Distant Supervision: A Neural Expectation-Maximization Framework…☆10May 24, 2020Updated 5 years ago