ning-mz / SCA-GPS
Code for the ACM MM 2023 paper: A Symbolic Characters Aware Model for Solving Geometry Problems
☆13Updated 8 months ago
Related projects:
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension.☆51Updated 3 months ago
- ☆12Updated 2 months ago
- A benchmark for evaluating the capabilities of large vision-language models (LVLMs)☆32Updated 10 months ago
- Code for our Paper "All in an Aggregated Image for In-Image Learning"☆27Updated 5 months ago
- Multi-modal code generation problems.☆15Updated 2 weeks ago
- ☆15Updated 6 months ago
- A Synthetic, Scalable and Systematic Evaluation Suite for Large Language Models☆31Updated 3 months ago
- Evaluating Mathematical Reasoning Beyond Accuracy☆32Updated 5 months ago
- An automatic MLLM hallucination detection framework☆17Updated 11 months ago
- [ICML 2024] Language Models Represent Beliefs of Self and Others☆24Updated 2 months ago
- ☆53Updated 5 months ago
- ☆11Updated 2 months ago
- This repository includes the official implementation of our paper "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness …☆19Updated last year
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)☆86Updated 3 months ago
- ☆24Updated 7 months ago
- Visual and Embodied Concepts evaluation benchmark☆21Updated 11 months ago
- Code for Findings of EMNLP2023 paper "Coarse-to-Fine Dual Encoders are Better Frame Identification Learners"☆12Updated 11 months ago
- [Arxiv] Calibrated Self-Rewarding Vision Language Models☆35Updated 3 months ago
- ☆12Updated 2 months ago
- ☆53Updated 2 months ago
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain☆98Updated 6 months ago
- Towards Large Multimodal Models as Visual Foundation Agents☆87Updated 3 weeks ago
- EfficientVLM: Fast and Accurate Vision-Language Models via Knowledge Distillation and Modal-adaptive Pruning (ACL 2023)☆19Updated last year
- ☆22Updated last month
- [CVPR'24 Highlight] The official code and data for paper "EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Lan…☆45Updated this week
- Source code for the paper "Prefix Language Models are Unified Modal Learners"☆42Updated last year
- ☆22Updated 2 months ago
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards☆33Updated last month
- Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …☆22Updated 2 weeks ago
- Code and datasets for "What’s “up” with vision-language models? Investigating their struggle with spatial reasoning".☆32Updated 6 months ago