TencentARC / Plot2Code
☆16 Updated 3 months ago
Related projects
Alternatives and complementary repositories for Plot2Code
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents" ☆38 Updated 7 months ago
- MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation ☆17 Updated 2 weeks ago
- Official implementation of "Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models" ☆35 Updated 10 months ago
- ☆45 Updated last year
- Code and data for the benchmark "Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Lan… ☆34 Updated 4 months ago
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆47 Updated last month
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs ☆62 Updated 3 weeks ago
- ☆60 Updated last year
- Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models ☆70 Updated 2 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs ☆38 Updated 4 months ago
- Official implementation of ECCV24 paper: POA ☆24 Updated 3 months ago
- Lottery Ticket Adaptation ☆36 Updated last month
- [NeurIPS 2024 D&B Track] GTA: A Benchmark for General Tool Agents ☆46 Updated 2 weeks ago
- This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs. ☆17 Updated 4 months ago
- Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs ☆22 Updated last month
- [Under Review] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with enla… ☆45 Updated last month
- Official implementation for "Law of the Weakest Link: Cross capabilities of Large Language Models" ☆37 Updated last month
- [NeurIPS 2024] A task generation and model evaluation system for multimodal language models. ☆56 Updated last month
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension. ☆59 Updated 5 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆30 Updated 9 months ago
- Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free" ☆36 Updated last month
- ☆35 Updated 3 months ago
- An Easy-to-use Hallucination Detection Framework for LLMs. ☆48 Updated 7 months ago
- Code release for "SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers" ☆40 Updated last month
- [ICML 2024 Oral] A framework for society simulation that supports complex simulation, for example: multi-scene. ☆52 Updated 3 months ago
- [TMLR] Public code repo for paper "A Single Transformer for Scalable Vision-Language Modeling" ☆115 Updated last week
- [NeurIPS 2024] TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration ☆17 Updated last month
- Pytorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models ☆28 Updated 8 months ago
- [EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models" ☆14 Updated last month
- MATH-Vision dataset and code to measure Multimodal Mathematical Reasoning capabilities. ☆69 Updated last month