Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows
☆160Jan 19, 2026Updated 2 months ago
Alternatives and similar repositories for SGI-Bench
Users who are interested in SGI-Bench are comparing it to the libraries listed below
- ☆12Oct 24, 2024Updated last year
- graphs from Draw.io☆14Sep 26, 2024Updated last year
- Aerial Detection Toolbox☆11Jan 18, 2023Updated 3 years ago
- [ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning".☆32Feb 26, 2026Updated 3 weeks ago
- ☆35Aug 18, 2025Updated 7 months ago
- Official Repository: A Comprehensive Benchmark for Logical Reasoning in MLLMs☆45Jun 17, 2025Updated 9 months ago
- Official code repository for the paper: AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Grap…☆13Oct 30, 2024Updated last year
- [NeurIPS 2024] BEACON: Benchmark for Comprehensive RNA Tasks and Language Models☆62Aug 2, 2024Updated last year
- Stable-DiffCoder is a family of lightweight open-source code DLLMs (diffusion large language models) comprising base and instruct models, …☆79Mar 9, 2026Updated last week
- This repository contains the code and data for the paper "Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents wit…☆58Mar 14, 2026Updated last week
- ☆24Oct 9, 2025Updated 5 months ago
- Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL).☆48Oct 16, 2025Updated 5 months ago
- Code for "Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models", ICLR 2024 Oral.☆21Feb 4, 2026Updated last month
- ☆72Mar 3, 2026Updated 2 weeks ago
- ☆139Updated this week
- [AAAI 2026] ReCode: Reinforced Code Knowledge Editing for API Updates☆24Jul 1, 2025Updated 8 months ago
- ☆16May 26, 2025Updated 9 months ago
- SSRL: Self-Search Reinforcement Learning☆207Aug 20, 2025Updated 7 months ago
- From cryo-EM density map to atomic structure☆25Feb 7, 2026Updated last month
- scMalignantFinder is a Python package specially designed for analyzing cancer single-cell RNA-seq datasets to distinguish malignant cells…☆17Feb 28, 2026Updated 3 weeks ago
- The official implementation of Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight☆88Jan 16, 2026Updated 2 months ago
- ☆16Nov 5, 2024Updated last year
- Codes and Datasets for the Paper: Text-Tuple-Table: Towards Information Integration in Text-to-Table Generation via Global Tuple Extracti…☆15Jun 5, 2024Updated last year
- Official Implementation of "Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts" at EMNLP 202…☆13Oct 27, 2024Updated last year
- [ECCV 2024] Official Implementation of CoPT: Unsupervised Domain Adaptive Segmentation using Domain-Agnostic Text Embeddings☆11Feb 24, 2025Updated last year
- The first high school physics Olympiad benchmark for evaluating (M)LLMs with step-level grading and human-level comparison.☆25Dec 19, 2025Updated 3 months ago
- a deep generative model for single-cell survival analysis☆16Dec 2, 2025Updated 3 months ago
- [AAAI2025] Multi-clue Consistency Learning to Bridge Gaps Between General and Oriented Object in Semi-supervised Detection☆33Jun 26, 2025Updated 8 months ago
- [ACL 2024] Implementation for Advancing Abductive Reasoning in Knowledge Graphs through Complex Logical Hypothesis Generation☆15Oct 9, 2025Updated 5 months ago
- ☆14May 25, 2022Updated 3 years ago
- Codes and data for KDD 2024 Research Track paper "ProCom: A Few-shot Targeted Community Detection Algorithm"☆11Aug 15, 2024Updated last year
- [ICCV 2025] Can Generative Geospatial Diffusion Models Excel as Discriminative Geospatial Foundation Models?☆23Sep 16, 2025Updated 6 months ago
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"☆40Nov 11, 2024Updated last year
- Official implementation for our paper: Rethinking Video Tokenization: A Conditioned Diffusion-based Approach☆14Apr 2, 2025Updated 11 months ago
- [ICML2025] Official code for "Reinforced Lifelong Editing for Language Models"☆21Feb 23, 2025Updated last year
- WWW 2024: New Frontiers of Knowledge Graph Reasoning: Recent Advances and Future Trends☆18May 14, 2024Updated last year
- Interpretation of RNAseq experiments through robust, efficient comparison to public databases☆16Oct 31, 2025Updated 4 months ago
- Code and data repository for "The Mirage of Model Editing: Revisiting Evaluation in the Wild"☆16Aug 27, 2025Updated 6 months ago
- Benchmark for Answering Existential First Order Queries with Single Free Variable (NeurIPS dataset and benchmark 2021)☆20May 3, 2023Updated 2 years ago