HITsz-TMG / VisionGraphLinks
The benchmark and datasets of the ICML 2024 paper "VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context"
☆14Updated last year
Alternatives and similar repositories for VisionGraph
Users that are interested in VisionGraph are comparing it to the libraries listed below
Sorting:
- Code and data for paper "Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation".☆17Updated 2 months ago
- code for Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning☆16Updated last year
- Unsupervised GRPO☆39Updated last month
- Official Implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent."☆77Updated 8 months ago
- ☆17Updated 11 months ago
- WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning☆27Updated last month
- ☆22Updated last year
- [ICML 2025] Official resources of "KBQA-o1: Agentic Knowledge Base Question Answering with Monte Carlo Tree Search".☆26Updated 2 months ago
- [ACL'25] Mosaic-IT: Cost-Free Compositional Data Synthesis for Instruction Tuning☆19Updated 3 weeks ago
- ☆18Updated 4 months ago
- ☆21Updated 2 months ago
- [NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs☆24Updated 9 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."☆42Updated 8 months ago
- Code and data for ACL 2024 paper on 'Cross-Modal Projection in Multimodal LLMs Doesn't Really Project Visual Attributes to Textual Space'☆15Updated 11 months ago
- Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models☆40Updated this week
- ☆31Updated last year
- MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion (ACL 2025)☆27Updated this week
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025.☆22Updated 5 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains☆46Updated 2 months ago
- Codes for ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding [ICML 2025]]☆35Updated last week
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433☆26Updated 7 months ago
- RM-R1: Unleashing the Reasoning Potential of Reward Models☆113Updated 3 weeks ago
- The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"☆74Updated 7 months ago
- A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models☆24Updated 7 months ago
- ☆19Updated 4 months ago
- A trainable user simulator☆34Updated 2 weeks ago
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting☆32Updated last year
- (ICLR2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception.☆35Updated 2 weeks ago
- This repository contains the code and data for the paper "VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception o…☆23Updated last week
- A curated list of the latest advancements, papers, tools, and datasets for **Multimodal Retrieval-Augmented Generation (RAG)**. Multimoda…☆22Updated 6 months ago