Leezekun / MMSci
MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension
☆44 Updated 5 months ago
Alternatives and similar repositories for MMSci
Users interested in MMSci are comparing it to the repositories listed below:
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆56 Updated 6 months ago
- Code release for "SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers" [NeurIPS D&B, 2024] ☆57 Updated 3 months ago
- [NAACL 2025 Oral] Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Language Models ☆42 Updated last week
- Official Repository of Are Your LLMs Capable of Stable Reasoning? ☆25 Updated last month
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073 ☆28 Updated 10 months ago
- Official Code of IdealGPT ☆36 Updated last year
- Preference Learning for LLaVA ☆44 Updated 6 months ago
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension. ☆65 Updated 11 months ago
- Code for "A Sober Look at Progress in Language Model Reasoning" paper ☆45 Updated this week
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆93 Updated this week
- The code and data for the paper JiuZhang3.0 ☆43 Updated 11 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆42 Updated 6 months ago
- This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models" ☆49 Updated 6 months ago
- Official implementation of the paper "Process Reward Model with Q-value Rankings" ☆57 Updated 3 months ago
- Large Language Models Can Self-Improve in Long-context Reasoning ☆69 Updated 5 months ago
- Exploration of automated dataset selection approaches at large scales. ☆40 Updated 2 months ago
- [NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs ☆111 Updated 2 weeks ago
- [NeurIPS 2024] Code and Data Repo for Paper "Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning" ☆25 Updated 11 months ago
- An Easy-to-use Hallucination Detection Framework for LLMs. ☆58 Updated last year
- MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale ☆42 Updated 5 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆69 Updated last month
- Official Implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent." ☆57 Updated 5 months ago
- [ArXiv] V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding ☆47 Updated 4 months ago
- [NeurIPS 2024] A comprehensive benchmark for evaluating critique ability of LLMs ☆39 Updated 5 months ago
- Official implementation of "Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation" (CVPR 202… ☆25 Updated last month
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆100 Updated 2 months ago
- This repository contains the code and data for the paper "VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception o… ☆22 Updated last month