zijianchen98 / OBI-Bench
[ICLR'25] The first benchmark aiming to evaluate whether LMMs can assist with oracle bone inscription processing tasks
☆20 Updated 9 months ago
Alternatives and similar repositories for OBI-Bench
Users who are interested in OBI-Bench are comparing it to the libraries listed below
- [arXiv 25] Aesthetics is Cheap, Show me the Text: An Empirical Evaluation of State-of-the-Art Generative Models for OCR ☆245 Updated 4 months ago
- ☆13 Updated last year
- Continuous diffusion for layout generation ☆52 Updated 10 months ago
- [CVPR 2024] Dynamic Prompt Optimizing for Text-to-Image Generation ☆84 Updated last year
- Text Image Inpainting via Global Structure-Guided Diffusion Models (Accepted by AAAI-24) ☆75 Updated 9 months ago
- Official repository for LLaVA-Reward (ICCV 2025): Multimodal LLMs as Customized Reward Models for Text-to-Image Generation ☆22 Updated 5 months ago
- [ICLR'25] Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training ☆46 Updated 11 months ago
- [ACL 2025 main] The official GitHub page of "Reviving Cultural Heritage: A Novel Approach for Comprehensive Historical Document Restorati… ☆51 Updated 3 weeks ago
- [ICLR2025] MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models ☆92 Updated last year
- [ICML 2024] On Discrete Prompt Optimization for Diffusion Models - Google ☆63 Updated last year
- [NeurIPS'24] I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing ☆29 Updated last month
- ☆57 Updated last month
- [IJCV 2025] Smaller But Better: Unifying Layout Generation with Smaller Large Language Models ☆149 Updated 5 months ago
- Compositional Inversion for Stable Diffusion Models (AAAI 2024) ☆37 Updated 10 months ago
- EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling ☆191 Updated last month
- Assessing Context-Aware Creative Intelligence in MLLMs ☆23 Updated 5 months ago
- [NeurIPS 2025 Spotlight] Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning ☆78 Updated 3 months ago
- The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”. ☆44 Updated last year
- Unified Multi-modal IAA Baseline and Benchmark ☆91 Updated last year
- 🔥🔥 [NeurIPS2025] Exploring and mitigating semantic hallucinations in scene text perception and reasoning ☆23 Updated last month
- [PR 2025] The official GitHub page of "MegaHan97K: A Large-Scale Dataset for Mega-Category Chinese Character Recognition with over 97K Ca… ☆72 Updated 3 weeks ago
- A collection of AI-generated images papers and corresponding source code/demo program, including text-to-image, image translation (e.g., … ☆13 Updated 2 years ago
- [ACMMM 2024] AesExpert: Towards Multi-modality Foundation Model for Image Aesthetics Perception ☆98 Updated 11 months ago
- Training A Small Emotional Vision Language Model for Visual Art Comprehension ☆15 Updated last year
- This is the official implementation of 2024 CVPR paper "EmoGen: Emotional Image Content Generation with Text-to-Image Diffusion Models". ☆91 Updated 2 months ago
- [AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning" ☆59 Updated last month
- [CVPR 2025] RAP: Retrieval-Augmented Personalization ☆78 Updated last month
- [ICCV2025] A Token-level Text Image Foundation Model for Document Understanding ☆128 Updated 4 months ago
- [NeurIPS 2024] Official Code for the Paper "Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning" ☆25 Updated 9 months ago
- ☆53 Updated 9 months ago