InternScience / SGI-Bench
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows
☆147 Updated 2 weeks ago
Alternatives and similar repositories for SGI-Bench
Users who are interested in SGI-Bench are comparing it to the repositories listed below.
- A curated collection of papers, datasets, and resources on Scientific Datasets and Large Language Models (LLMs) ☆433 Updated 4 months ago
- [ICLR 2026] TraceRL & TraDo-8B: Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models ☆419 Updated last week
- Minimalist RL for Diffusion LLMs with SOTA reasoning performance (89.1% GSM8K). Official implementation of "The Flexibility Trap". ☆111 Updated last week
- ☆61 Updated last month
- Survey and paper list on efficiency-guided LLM agents (memory, tool learning, planning). ☆122 Updated this week
- ☆507 Updated last week
- A Scientific Multimodal Foundation Model ☆629 Updated 4 months ago
- [NeurIPS 2025] Thinkless: LLM Learns When to Think ☆250 Updated 4 months ago
- LLaDA2.0 is the diffusion language model series developed by InclusionAI team, Ant Group. ☆236 Updated last month
- A unified evaluation toolkit and leaderboard for rigorously assessing the scientific intelligence of large language and vision–language m… ☆69 Updated this week
- ☆132 Updated 2 months ago
- This is the official Python version of Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play. ☆110 Updated 3 months ago
- Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge ☆104 Updated this week
- Paper list of agent for science ☆191 Updated 2 weeks ago
- Official implementation of paper "Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models" ☆65 Updated 3 weeks ago
- [MTI-LLM@NeurIPS 2025] Official implementation of "PyVision: Agentic Vision with Dynamic Tooling." ☆147 Updated 6 months ago
- Official implementation of X-Master, a general-purpose tool-augmented reasoning agent. ☆308 Updated 3 months ago
- [ICLR 2026] Geometric-Mean Policy Optimization ☆99 Updated last week
- Demystifying Reinforcement Learning in Agentic Reasoning ☆159 Updated 3 months ago
- Repo for paper https://arxiv.org/abs/2504.13837 ☆325 Updated last month
- Official Repository for PosterGen ☆209 Updated 3 weeks ago
- [🏆AAAI2025] Official Repo for ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area. ☆68 Updated last month
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆129 Updated 6 months ago
- OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images. ☆349 Updated 8 months ago
- A collection of resources and papers on AI Scientist / Robot Scientist ☆123 Updated 4 months ago
- ☆385 Updated 2 months ago
- The code and data of We-Math 2.0. ☆164 Updated 5 months ago
- PaperBanana: Automating Academic Illustration For AI Scientists ☆451 Updated this week
- Official implementation of the NeurIPS 2025 paper "Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space" ☆304 Updated last week
- We introduce Reasoning via Video, a new paradigm that uses maze-solving video generation to probe multimodal reasoning; our VR-Bench show… ☆50 Updated last month