zijianchen98 / OBI-Bench
[ICLR'25] The first benchmark aiming to evaluate whether LMMs can assist oracle bone inscription processing tasks
☆20 · Updated 7 months ago
Alternatives and similar repositories for OBI-Bench
Users who are interested in OBI-Bench are comparing it to the repositories listed below.
- Text Image Inpainting via Global Structure-Guided Diffusion Models (Accepted by AAAI-24) ☆74 · Updated 7 months ago
- [arXiv 25] Aesthetics is Cheap, Show me the Text: An Empirical Evaluation of State-of-the-Art Generative Models for OCR ☆236 · Updated 2 months ago
- [CVPR 2024] Dynamic Prompt Optimizing for Text-to-Image Generation ☆82 · Updated last year
- ☆12 · Updated last year
- [NeurIPS'24] I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing ☆26 · Updated 3 months ago
- EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling ☆156 · Updated last week
- Dreambooth (LoRA) with well-organized code structure. Naive adaptation from 🤗Diffusers. ☆14 · Updated 2 years ago
- Official repository for LLaVA-Reward (ICCV 2025): Multimodal LLMs as Customized Reward Models for Text-to-Image Generation ☆21 · Updated 3 months ago
- [ICML 2024] On Discrete Prompt Optimization for Diffusion Models - Google ☆62 · Updated last year
- Official Implementation of OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation ☆33 · Updated 4 months ago
- The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”. ☆43 · Updated last year
- Compositional Inversion for Stable Diffusion Models (AAAI 2024) ☆36 · Updated 8 months ago
- [NeurIPS 2025 Spotlight] Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning ☆71 · Updated last month
- Assessing Context-Aware Creative Intelligence in MLLMs ☆23 · Updated 3 months ago
- [ICLR2025] MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models ☆88 · Updated last year
- Continuous diffusion for layout generation ☆50 · Updated 8 months ago
- [WWW 2025] Official PyTorch Code for "CTR-Driven Advertising Image Generation with Multimodal Large Language Models" ☆58 · Updated 3 months ago
- Oracle Bone Script data collected by VLRLab of HUST ☆58 · Updated last year
- [NeurIPS'24] Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment ☆57 · Updated last year
- (arXiv.2405.18406) RACCooN: A Versatile Instructional Video Editing Framework with Auto-Generated Narratives ☆36 · Updated last year
- ☆52 · Updated 7 months ago
- Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models ☆63 · Updated last year
- Official code for paper "Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models, ICML2024" ☆24 · Updated 9 months ago
- ECCV2024_Parrot Captions Teach CLIP to Spot Text ☆65 · Updated last year
- A collection of AI-generated images papers and corresponding source code/demo program, including text-to-image, image translation (e.g., … ☆13 · Updated last year
- a collection of awesome autoregressive visual generation models ☆78 · Updated 6 months ago
- A Large-scale Dataset for training and evaluating model's ability on Dense Text Image Generation ☆81 · Updated last month
- [ACL 2025 main] The official GitHub page of "Reviving Cultural Heritage: A Novel Approach for Comprehensive Historical Document Restorati… ☆47 · Updated last month
- [NeurIPS 2024] Official Code for the Paper "Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning" ☆24 · Updated 7 months ago
- 🔥🔥 [NeurIPS2025] Exploring and mitigating semantic hallucinations in scene text perception and reasoning ☆18 · Updated last month