InternScience / SciEvalKit
A unified evaluation toolkit and leaderboard for rigorously assessing the scientific intelligence of large language and vision–language models across the full research workflow.
☆69 Updated this week
Alternatives and similar repositories for SciEvalKit
Users who are interested in SciEvalKit are comparing it to the repositories listed below.
- [NeurIPS 2025] Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing ☆90 Updated 6 months ago
- Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025) ☆238 Updated 6 months ago
- Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows ☆147 Updated 2 weeks ago
- ☆61 Updated last month
- Pixel-Level Reasoning Model trained with RL [NeurIPS25] ☆269 Updated 2 months ago
- [ICCV25 Oral] Token Activation Map to Visually Explain Multimodal LLMs ☆166 Updated last month
- [CVPR2025 Highlight] Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models ☆233 Updated 2 months ago
- Interleaving Reasoning: Next-Generation Reasoning Systems for AGI ☆248 Updated 3 months ago
- Official codes of "Monet: Reasoning in Latent Visual Space Beyond Image and Language" ☆123 Updated last month
- Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search" ☆395 Updated last week
- A collection of awesome "think with videos" papers. ☆86 Updated 2 months ago
- Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning" ☆108 Updated last month
- [TMLR 25] SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models ☆149 Updated 3 months ago
- Official code for NeurIPS 2025 paper "GRIT: Teaching MLLMs to Think with Images" ☆173 Updated 2 weeks ago
- [NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning ☆96 Updated 4 months ago
- Codes for Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models ☆277 Updated 6 months ago
- Official Repository: A Comprehensive Benchmark for Logical Reasoning in MLLMs ☆45 Updated 7 months ago
- Agentic MLLMs ☆159 Updated 3 months ago
- ☆38 Updated 6 months ago
- ☆62 Updated 2 months ago
- ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning ☆111 Updated 3 months ago
- [ICLR'26] Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology ☆73 Updated last week
- Code for Retrieval-Augmented Perception (ICML 2025) ☆67 Updated 5 months ago
- The official repository for the Scientific Paper Idea Proposer (SciPIP) ☆67 Updated 11 months ago
- A paper list for spatial reasoning ☆631 Updated 2 weeks ago
- [ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models ☆110 Updated last year
- [CVPR 2025] RAP: Retrieval-Augmented Personalization ☆79 Updated 2 months ago
- A curated collection of papers, datasets, and resources on Scientific Datasets and Large Language Models (LLMs) ☆433 Updated 4 months ago
- [CVPR' 25] Interleaved-Modal Chain-of-Thought ☆106 Updated last month
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆104 Updated 4 months ago