zijianchen98 / OBI-Bench
[ICLR'25] The first benchmark aiming to evaluate whether LMMs can assist oracle bone inscription processing tasks
☆20 · Updated 8 months ago
Alternatives and similar repositories for OBI-Bench
Users who are interested in OBI-Bench are comparing it to the repositories listed below
- ☆13 · Updated last year
- [arXiv 25] Aesthetics is Cheap, Show me the Text: An Empirical Evaluation of State-of-the-Art Generative Models for OCR ☆242 · Updated 3 months ago
- Oracle Bone Script data collected by VLRLab of HUST ☆61 · Updated last year
- [CVPR 2024] Dynamic Prompt Optimizing for Text-to-Image Generation ☆84 · Updated last year
- [IJCV 2025] Smaller But Better: Unifying Layout Generation with Smaller Large Language Models ☆150 · Updated 3 months ago
- [NeurIPS'24] Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment ☆57 · Updated last year
- [ICLR'25] Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training ☆46 · Updated 10 months ago
- EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling ☆171 · Updated last week
- [ICML 2024] On Discrete Prompt Optimization for Diffusion Models - Google ☆62 · Updated last year
- Official repository for LLaVA-Reward (ICCV 2025): Multimodal LLMs as Customized Reward Models for Text-to-Image Generation ☆22 · Updated 4 months ago
- The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”. ☆44 · Updated last year
- Assessing Context-Aware Creative Intelligence in MLLMs ☆23 · Updated 4 months ago
- Text Image Inpainting via Global Structure-Guided Diffusion Models (Accepted by AAAI-24) ☆74 · Updated 7 months ago
- [ICLR2025] MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models ☆90 · Updated last year
- [NeurIPS'24] I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing ☆26 · Updated 4 months ago
- Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types ☆32 · Updated 4 months ago
- Doodling our way to AGI ✏️ 🖼️ 🧠 ☆113 · Updated 6 months ago
- LMM solved catastrophic forgetting, AAAI2025 ☆44 · Updated 7 months ago
- [ICCV2025] A Token-level Text Image Foundation Model for Document Understanding ☆124 · Updated 3 months ago
- A collection of AI-generated images papers and corresponding source code/demo program, including text-to-image, image translation (e.g., … ☆13 · Updated 2 years ago
- [PR 2025] The official GitHub page of "MegaHan97K: A Large-Scale Dataset for Mega-Category Chinese Character Recognition with over 97K Ca… ☆69 · Updated 4 months ago
- Official Implementation of OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation ☆36 · Updated 4 months ago
- Continuous diffusion for layout generation ☆52 · Updated 9 months ago
- ☆38 · Updated 4 months ago
- Training A Small Emotional Vision Language Model for Visual Art Comprehension ☆15 · Updated last year
- [ICML 2025 Spotlight] MODA: MOdular Duplex Attention for Multimodal Perception, Cognition, and Emotion Understanding ☆61 · Updated 4 months ago
- SFT+RL boosts multimodal reasoning ☆37 · Updated 5 months ago
- Official code for CVPR 2024 paper: Discriminative Probing and Tuning for Text-to-Image Generation ☆33 · Updated 8 months ago
- [CVPR 2025] RAP: Retrieval-Augmented Personalization ☆74 · Updated last week
- ☆27 · Updated 2 weeks ago