WildVision-AI / LMM-Engines
☆14Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for LMM-Engines
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension.☆57Updated 5 months ago
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"☆46Updated 3 weeks ago
- ☆53Updated 2 months ago
- [NeurIPS '24 D&B] Official Dataloader and Evaluation Scripts for LongVideoBench.☆65Updated 3 months ago
- DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models☆56Updated 3 weeks ago
- [NeurIPS 2024] A task generation and model evaluation system for multimodal language models.☆57Updated last month
- A benchmark for evaluating the capabilities of large vision-language models (LVLMs)☆33Updated 11 months ago
- Official GitHub repo of G-LLaVA☆121Updated 5 months ago
- ☆84Updated 10 months ago
- This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or…☆107Updated 4 months ago
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"☆26Updated 4 months ago
- Code and data for the benchmark "Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Lan…☆34Updated 4 months ago
- Code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization"☆45Updated 2 months ago
- Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]☆127Updated last month
- Code release for "SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers"☆40Updated last month
- A Survey on the Honesty of Large Language Models☆44Updated last month
- Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)☆106Updated last month
- ☆121Updated 2 weeks ago
- Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models☆67Updated 4 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision☆95Updated 2 months ago
- This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models"☆43Updated 2 weeks ago
- An LLM-free Multi-dimensional Benchmark for Multi-modal Hallucination Evaluation☆93Updated 9 months ago
- Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning"☆63Updated 9 months ago
- [Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning☆72Updated 6 months ago
- EMNLP 2023 - InfoSeek: A New VQA Benchmark Focused on Visual Info-Seeking Questions☆16Updated 5 months ago
- [CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(…☆242Updated last week
- MATH-Vision dataset and code to measure Multimodal Mathematical Reasoning capabilities.☆68Updated last month
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073☆26Updated 4 months ago
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge.☆54Updated 3 months ago
- Self-Alignment with Principle-Following Reward Models☆148Updated 8 months ago