opendatalab/OHR-Bench

(ICCV 2025) OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation

☆104

Alternatives and similar repositories for OHR-Bench

Users who are interested in OHR-Bench are comparing it to the repositories listed below.

ZichenWen1 / DIJA
(ICLR 2026 🔥) Code for "The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs"
☆79 · Feb 9, 2026 · Updated 5 months ago
ZichenWen1 / AHGFC
The source code for “Homophily-Related: Adaptive Hybrid Graph Filter for Multi-View Graph Clustering”
☆11 · Apr 10, 2024 · Updated 2 years ago
opendatalab / labelbee
☆25 · Nov 7, 2022 · Updated 3 years ago
opendatalab / CLIP-Parrot-Bias
ECCV2024_Parrot Captions Teach CLIP to Spot Text
☆66 · Sep 6, 2024 · Updated last year
NEUIR / M2RAG
[MM '25] This is the code repo for our paper "Benchmarking Retrieval-Augmented Generation in Multi-Modal Contexts".
☆44 · Sep 27, 2025 · Updated 9 months ago
opendatalab / LEGION
[ICCV25 Highlight] The official implementation of the paper "LEGION: Learning to Ground and Explain for Synthetic Image Detection"
☆82 · Oct 22, 2025 · Updated 9 months ago
zyang-ur / idea2img
Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation, ECCV 2024
☆22 · Feb 15, 2024 · Updated 2 years ago
OpenDCAI / Flash-MinerU
Ray-powered accelerator for MinerU, turning PDF → Markdown into a scalable, cluster-ready data infrastructure. A Ray-based acceleration layer for MinerU that turns PDF …
☆65 · Apr 20, 2026 · Updated 3 months ago
ali-bahrainian / RAG_best_practices
☆107 · Mar 25, 2025 · Updated last year
opendatalab / OmniDocBench
[CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation
☆1,914 · Updated this week
opendatalab / MLLM-DataEngine
MLLM-DataEngine: An Iterative Refinement Approach for MLLM
☆49 · May 24, 2024 · Updated 2 years ago
opendatalab / dsdl-docs
Data Set Description Language Specification (DSDL, a new-generation AI dataset description language)
☆46 · May 29, 2024 · Updated 2 years ago
Niujunbo2002 / NativeRes-LLaVA
Official code repo for our work "Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models"
☆55 · Jun 17, 2025 · Updated last year
OpenMatch / MARVEL
[ACL 2024 Oral] This is the code repo for our ACL'24 paper "MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Mo…
☆39 · Jun 30, 2024 · Updated 2 years ago
NEUIR / MemGraph
[SIGIR '25] This is the code repo for our SIGIR '25 paper: Enhancing the Patent Matching Capability of Large Language Models via Memory G…
☆19 · Apr 22, 2025 · Updated last year
opendatalab / VIGC
AAAI 2024: Visual Instruction Generation and Correction
☆97 · Feb 4, 2024 · Updated 2 years ago
OpenBMB / RAG-DDR
This is the code repo for the paper "RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards".
☆23 · Oct 28, 2024 · Updated last year
OpenBMB / DEBATER
This is the code repo for our paper "Learning More Effective Representations for Dense Retrieval through Deliberate Thinking Before Searc…
☆26 · Mar 2, 2025 · Updated last year
opendatalab / TRivia
(CVPR 2026) TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition
☆35 · Jul 14, 2026 · Updated last week
Lucanyc / VISTA-Gym
☆27 · Mar 17, 2026 · Updated 4 months ago
NEUIR / HIPPO
HIPPO: Enhancing the Table Understanding Capability of Large Language Models through Hybrid-Modal Preference Optimization
☆18 · May 29, 2025 · Updated last year
193746 / VHASR
☆11 · Oct 31, 2024 · Updated last year
OpenMatch / TASTE
[CIKM 2023 Oral] This is the code repo for our CIKM'23 paper "Text Matching Improves Sequential Recommendation by Reducing Popularity Bia…
☆40 · Mar 17, 2024 · Updated 2 years ago
opendatalab / opendatalab-datasets
datasets resource
☆150 · May 27, 2026 · Updated last month
NEUIR / UNIKIE-BENCH
[ACL '26] Source code and datasets for our paper "UNIKIE-BENCH: Benchmarking Large Multimodal Models for Key Information Extraction in Vi…
☆17 · Apr 28, 2026 · Updated 2 months ago
pavanjava / healthcare_agentic_rag
This repository provides agentic RAG powered by CrewAI and Qdrant, applied to the healthcare industry.
☆18 · Jan 11, 2025 · Updated last year
opendatalab / mineru-vl-utils
A Python package for interacting with the MinerU Vision-Language Model.
☆136 · Jun 11, 2026 · Updated last month
NEUIR / KAIR
Source code for our paper "Finding What Matters: Anchoring Context Knowledge with Evolving Indices for Iterative Retrieval"
☆29 · Jun 2, 2026 · Updated last month
JulioZhao97 / EffTrans_Fsdet
This is a repository for the ACMMM22 paper "Exploring Effective Knowledge Transfer for Few-shot Object Detection"
☆18 · Jun 21, 2023 · Updated 3 years ago
shaharl6000 / MoreDocsSameLen
This repository contains code and datasets for our paper on the effects of document multiplicity while the context size is fixed in Retri…
☆18 · Mar 13, 2025 · Updated last year
conghui / replaycode
ReplayCode — first open-source rebuild of Claude Code that actually runs. Built from decompiled source with Node.js/esbuild
☆20 · Apr 1, 2026 · Updated 3 months ago
zc-97 / SSDRec
☆11 · Jul 28, 2023 · Updated 2 years ago
opendatalab / DocLayout-YOLO
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
☆2,235 · Apr 14, 2025 · Updated last year
hychaochao / Chat-Models-Backdoor-Attacking
Code for the paper "Exploring Backdoor Vulnerabilities of Chat Models"
☆19 · Apr 13, 2024 · Updated 2 years ago
hanwenzhu / dreamhoi
DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors
☆37 · Sep 13, 2024 · Updated last year
MJ-Bench / MJ-Bench
(NeurIPS 2025) Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?"
☆51 · Jun 3, 2025 · Updated last year
HKUST-KnowComp / PseudoReasoner
Official code repository for Findings of EMNLP 2022 paper: PseudoReasoner: Leveraging Pseudo Labels for Commonsense Knowledge Base Popula…
☆11 · Oct 18, 2022 · Updated 3 years ago
THUNLP-MT / Brote
☆11 · Jan 19, 2025 · Updated last year
OpenBMB / VisRAG
Parsing-free RAG supported by VLMs
☆973 · Jul 17, 2026 · Updated last week