mat-agent / MAT-Agent
☆30 Updated last week
Alternatives and similar repositories for MAT-Agent:
Users who are interested in MAT-Agent are comparing it to the repositories listed below.
- ☆72 Updated 10 months ago
- A Self-Training Framework for Vision-Language Reasoning ☆76 Updated 3 months ago
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models. ☆72 Updated 5 months ago
- ☆22 Updated 2 months ago
- A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models. ☆56 Updated last month
- The official repository for the paper "Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchmark" ☆51 Updated this week
- ☆41 Updated 2 weeks ago
- ☆93 Updated last week
- ☆145 Updated 5 months ago
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024) ☆46 Updated 5 months ago
- code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization" ☆54 Updated 8 months ago
- The official code repository for PRMBench. ☆72 Updated 2 months ago
- A RLHF Infrastructure for Vision-Language Models ☆171 Updated 5 months ago
- ✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio ☆45 Updated 6 months ago
- ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation ☆34 Updated 3 weeks ago
- M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆56 Updated 4 months ago
- ☆73 Updated 3 months ago
- A comprehensive collection of process reward models. ☆67 Updated this week
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models ☆71 Updated 10 months ago
- A curated collection of resources, tools, and frameworks for developing GUI Agents. ☆24 Updated this week
- [Neurips 24' D&B] Official Dataloader and Evaluation Scripts for LongVideoBench. ☆94 Updated 8 months ago
- ☆60 Updated this week
- MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency ☆100 Updated 3 weeks ago
- [ICLR 2025] The official pytorch implement of "Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Cont… ☆31 Updated 4 months ago
- Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization ☆86 Updated last year
- VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning ☆22 Updated last week
- [CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding ☆267 Updated 6 months ago
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs ☆47 Updated last month
- [EMNLP'23] The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models" ☆82 Updated last year
- ☆113 Updated 2 months ago