InternScience / SGI-Bench
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows
☆147 Updated 2 weeks ago
Alternatives and similar repositories for SGI-Bench
Users who are interested in SGI-Bench are comparing it to the repositories listed below.
- ☆61 Updated last month
- A unified evaluation toolkit and leaderboard for rigorously assessing the scientific intelligence of large language and vision–language m… ☆69 Updated this week
- Paper list of agent for science ☆191 Updated 2 weeks ago
- A curated collection of papers, datasets, and resources on Scientific Datasets and Large Language Models (LLMs) ☆433 Updated 4 months ago
- Official implementation of paper "Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models" ☆65 Updated 3 weeks ago
- Official implementation of X-Master, a general-purpose tool-augmented reasoning agent. ☆308 Updated 3 months ago
- [NeurIPS 2025] Thinkless: LLM Learns When to Think ☆250 Updated 4 months ago
- Minimalist RL for Diffusion LLMs with SOTA reasoning performance (89.1% GSM8K). Official implementation of "The Flexibility Trap". ☆111 Updated 2 weeks ago
- A Scientific Multimodal Foundation Model ☆629 Updated 4 months ago
- [🏆AAAI2025] Official Repo for ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area. ☆68 Updated last month
- A collection of resources and papers on AI Scientist / Robot Scientist ☆123 Updated 4 months ago
- (ACL-2025 main conference) Dolphin: Moving Towards Closed-loop Auto-research through Thinking, Practice, and Feedback ☆38 Updated 7 months ago
- Official codes of "Monet: Reasoning in Latent Visual Space Beyond Image and Language" ☆123 Updated last month
- The code and data of We-Math 2.0. ☆164 Updated 5 months ago
- Survey and paper list on efficiency-guided LLM agents (memory, tool learning, planning). ☆154 Updated last week
- [ICLR 2026] TraceRL & TraDo-8B: Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models ☆419 Updated last week
- Official implementation of GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization ☆349 Updated 3 weeks ago
- [ICLR 2026] Geometric-Mean Policy Optimization ☆99 Updated last week
- This is the official Python version of Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play. ☆115 Updated 3 months ago
- LLaDA2.0 is the diffusion language model series developed by InclusionAI team, Ant Group. ☆240 Updated last month
- official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆37 Updated last year
- Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge ☆104 Updated last week
- [ACL 2025] Multi-Agent System for Science of Science ☆65 Updated 6 months ago
- [MTI-LLM@NeurIPS 2025] Official implementation of "PyVision: Agentic Vision with Dynamic Tooling." ☆147 Updated 6 months ago
- The official repository for the Scientific Paper Idea Proposer (SciPIP) ☆67 Updated 11 months ago
- Open-source Agentic RL for LLMs: RLAnything & DemyAgent ☆223 Updated this week
- ☆169 Updated 2 months ago
- ☆143 Updated 2 months ago
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆129 Updated 6 months ago
- Official Repository for PosterGen ☆209 Updated 3 weeks ago