Alpha-Innovator / DocGenome
DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models
☆127 · Updated last month
Alternatives and similar repositories for DocGenome:
Users who are interested in DocGenome are comparing it to the repositories listed below:
- (AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions ☆253 · Updated 10 months ago
- Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS ☆618 · Updated this week
- Improving Generalist Model with Domain-Specific Experts ☆82 · Updated last month
- [NeurIPS 2024] Matryoshka Query Transformer for Large Vision-Language Models ☆96 · Updated 7 months ago
- Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models ☆170 · Updated 3 months ago
- [ICLR 2025] Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models