infly-ai / INF-MLLM
☆54 Updated 8 months ago
Alternatives and similar repositories for INF-MLLM:
Users who are interested in INF-MLLM are comparing it to the repositories listed below.
- Touchstone: Evaluating Vision-Language Models by Language Models ☆82 Updated last year
- ☆20 Updated last year
- MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria ☆69 Updated 6 months ago
- ☆98 Updated last year
- Empirical Study Towards Building An Effective Multi-Modal Large Language Model ☆22 Updated last year
- [NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of… ☆115 Updated 5 months ago
- Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model ☆260 Updated 10 months ago
- SVIT: Scaling up Visual Instruction Tuning ☆163 Updated 10 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs ☆42 Updated 9 months ago
- ☆91 Updated last year
- [SCIS 2024] The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Di… ☆49 Updated 5 months ago
- ☆63 Updated last year
- A collection of visual instruction tuning datasets. ☆76 Updated last year
- MLLM-DataEngine: An Iterative Refinement Approach for MLLM ☆46 Updated 11 months ago
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆100 Updated last month
- ☆73 Updated 3 months ago
- ☆55 Updated 8 months ago
- [ArXiv] V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding ☆44 Updated 4 months ago
- 🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral) ☆64 Updated last year
- The proposed simulated dataset consisting of 9,536 charts and associated data annotations in CSV format. ☆22 Updated last year
- ☆87 Updated 9 months ago
- ☆47 Updated last month
- ☆51 Updated last year
- MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale ☆39 Updated 4 months ago
- Dataset pruning for ImageNet and LAION-2B. ☆78 Updated 9 months ago
- [NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs ☆108 Updated this week
- ☆73 Updated last year
- EfficientVLM: Fast and Accurate Vision-Language Models via Knowledge Distillation and Modal-adaptive Pruning (ACL 2023) ☆27 Updated last year
- ☆133 Updated last year
- Lion: Kindling Vision Intelligence within Large Language Models ☆52 Updated last year