Code release for "SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers" [NeurIPS D&B, 2024]
☆74 Jan 13, 2025 Updated last year
Alternatives and similar repositories for spiqa
Users who are interested in spiqa are comparing it to the libraries listed below
- The project page for "SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables" ☆23 Dec 21, 2023 Updated 2 years ago
- Repo for our work "Systematic Evaluation of Large Vision-Language Models for Surgical Artificial Intelligence" ☆19 Jun 2, 2025 Updated 8 months ago
- Detect-Then-Explain Framework for Text-to-SQL task ☆10 Dec 6, 2023 Updated 2 years ago
- [EMNLP 2024 Findings] ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs ☆29 May 22, 2025 Updated 9 months ago
- Official implementation of Panacea: A foundation model for clinical trial design, recruitment, search, and summarization. ☆18 Dec 24, 2024 Updated last year
- ☆11 Jan 3, 2024 Updated 2 years ago
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning" ☆13 Jun 22, 2025 Updated 8 months ago
- ☆12 Mar 5, 2025 Updated 11 months ago
- ☆12 Jun 20, 2023 Updated 2 years ago
- [ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning" ☆49 Jan 30, 2026 Updated last month
- Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators" ☆12 Mar 25, 2025 Updated 11 months ago
- Medea: An omics AI agent for therapeutic discovery ☆46 Jan 21, 2026 Updated last month
- ☆34 Jan 25, 2026 Updated last month
- SciAssess is a comprehensive benchmark for evaluating Large Language Models' proficiency in scientific literature analysis across various… ☆83 May 21, 2025 Updated 9 months ago
- [CVPR 2025] MicroVQA eval and 🤖RefineBot code for "MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research"… ☆32 Nov 25, 2025 Updated 3 months ago
- ☆52 Oct 17, 2023 Updated 2 years ago
- [EMNLP 2024] SURf: Teaching Large Vision-Language Models to Selectively Utilize Retrieved Information ☆12 Oct 11, 2024 Updated last year
- Radiology Language Evaluations ☆11 Nov 17, 2023 Updated 2 years ago
- ☆12 Mar 18, 2024 Updated last year
- LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding ☆34 Jan 16, 2026 Updated last month
- ☆12 Apr 6, 2024 Updated last year
- [Technical Report] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with … ☆63 Oct 9, 2024 Updated last year
- ☆63 Jan 3, 2025 Updated last year
- Well documented examples of running distributed training jobs on Modal ☆21 Updated this week
- [ICLR 2025] Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models ☆59 Jan 22, 2025 Updated last year
- ICCV 2025: Official Implementation of "Aligning Vision to Language: Annotation-Free Multimodal Knowledge Graph Construction for Enhanced L… ☆59 Oct 25, 2025 Updated 4 months ago
- Agent-based implementation of RAG, incorporating AI agents into the RAG pipeline to orchestrate its components and perform additional act… ☆19 Feb 20, 2025 Updated last year
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts" ☆19 Mar 10, 2025 Updated 11 months ago
- Code and Data for "FaithfulRAG: Fact-Level Conflict Modeling for Context-Faithful Retrieval-Augmented Generation" (ACL25) ☆29 Oct 26, 2025 Updated 4 months ago
- This repo contains code for the paper "Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM" ☆18 Oct 17, 2025 Updated 4 months ago
- Causal Analysis of Agent Behavior for AI Safety ☆20 Jun 27, 2023 Updated 2 years ago
- Data and Code for EMNLP 2025 Findings Paper "MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search" ☆89 Nov 4, 2025 Updated 3 months ago
- ☆24 May 13, 2025 Updated 9 months ago
- A Comprehensive Benchmark for Robust Multi-image Understanding ☆19 Sep 4, 2024 Updated last year
- ☆18 Oct 28, 2025 Updated 4 months ago
- ☆44 Jun 21, 2024 Updated last year
- [CVPR 2025] BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature ☆92 Mar 22, 2025 Updated 11 months ago
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences ☆43 Mar 11, 2025 Updated 11 months ago
- [NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models ☆53 Sep 29, 2025 Updated 5 months ago