OS-Copilot/ScienceBoard

[ICLR 2026] Code, benchmark and environment for "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows"

☆131

Alternatives and similar repositories for ScienceBoard

Users who are interested in ScienceBoard are comparing it to the repositories listed below.

InternLM / JanusCoder
[ICLR 2026] JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence
☆78 · May 9, 2026 · Updated 2 months ago
OS-Copilot / OS-Sentinel
[ACL 2026] Code, benchmark and environment for "OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic…
☆49 · Jul 5, 2026 · Updated 2 weeks ago
zzli2022 / TLDR
Code for Research Project TLDR
☆26 · Jul 28, 2025 · Updated 11 months ago
wjn1996 / Chain-of-Knowledge
☆24 · Jun 13, 2023 · Updated 3 years ago
chengyou-jia / T2IS
Official Repo for "Why Settle for One? Text-to-ImageSet Generation and Evaluation"
☆21 · Oct 1, 2025 · Updated 9 months ago
yayayacc / MUR
☆49 · May 14, 2026 · Updated 2 months ago
OS-Copilot / OS-Genesis
[ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
☆188 · Oct 8, 2025 · Updated 9 months ago
njucckevin / CapArena
An Arena-style Automated Evaluation Benchmark for Detailed Captioning
☆59 · Jun 1, 2025 · Updated last year
yayayacc / TIDE
☆18 · Feb 4, 2026 · Updated 5 months ago
hkust-nlp / GUIMid
☆22 · May 3, 2025 · Updated last year
chengyou-jia / AgentStore
[ACL 2025] AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant
☆46 · Dec 19, 2024 · Updated last year
xcltql666 / DenseDiT
Code for "From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios"
☆27 · Jun 7, 2026 · Updated last month
hanxuhu / SeqIns
The repository of the project "Fine-tuning Large Language Models with Sequential Instructions", code base comes from open-instruct and LA…
☆30 · Nov 24, 2024 · Updated last year
DaSE4Good / EfficientTools
☆13 · Jun 10, 2023 · Updated 3 years ago
xufangzhi / Odyssey-Arena
Extremely Long-Horizon Agentic Tasks Requiring Active Acting and Inductive Reasoning
☆33 · Feb 9, 2026 · Updated 5 months ago
chang-github-00 / LLM-Predictive-Decoding
☆16 · Jul 9, 2025 · Updated last year
Yingjia-Wan / FaStfact
Code repo for FaStfact: Faster, Stronger Long-Form Factuality Evaluations in LLMs.
☆33 · Nov 5, 2025 · Updated 8 months ago
HKUNLP / subgoal-theorem-prover
Code for the paper "Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving"
☆20 · May 25, 2023 · Updated 3 years ago
X-GenGroup / PaCo-RL
Official Implementation for "PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling"
☆42 · Dec 13, 2025 · Updated 7 months ago
njucckevin / OpenMobile-Code
The model, data and code for OpenMobile
☆49 · Jul 9, 2026 · Updated last week
xufangzhi / Genius
[ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework
☆72 · Jun 1, 2025 · Updated last year
scaleapi / researchrubrics
Code repository for ICLR 2026 paper "ResearchRubrics: A Benchmark of Prompts and Rubrics For Evaluating Deep Research Agents" (https://ww…
☆27 · Feb 10, 2026 · Updated 5 months ago
menik1126 / ParallelComp
[ICML 2025🔥] ParallelComp: Parallel Long-Context Compressor for Length Extrapolation
☆30 · Jun 16, 2025 · Updated last year
Timothyxxx / KVCachePapers
☆20 · May 24, 2024 · Updated 2 years ago
wjn1996 / KP-PLM
(Accepted By EMNLP2022 main long) Knowledge Prompting in Pre-trained Language Model for Natural Language Understanding
☆15 · Oct 29, 2022 · Updated 3 years ago
LARK-AI-Lab / CodeScaler
The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models"
☆35 · Mar 26, 2026 · Updated 3 months ago
NJUNLP / QAlign
☆39 · Jan 23, 2024 · Updated 2 years ago
yjywdzh / ACE
This repository contains the code for the paper "ACE: Attribution-Controlled Knowledge Editing for Multi-hop Factual Recall"
☆15 · Jan 31, 2026 · Updated 5 months ago
QiushiSun / Corex
[COLM'24] Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration
☆30 · Oct 18, 2024 · Updated last year
xufangzhi / phi-Decoding
[ACL 2025] An inference-time decoding strategy with adaptive foresight sampling
☆107 · May 18, 2025 · Updated last year
HKUNLP / RSA
Retrieved Sequence Augmentation for Protein Representation Learning
☆52 · Nov 1, 2023 · Updated 2 years ago
HKUNLP / SymGen
[EMNLP'23] Code for Generating Data for Symbolic Language with Large Language Models
☆18 · Oct 21, 2023 · Updated 2 years ago
CONE-MT / LLaMAX
☆75 · Dec 6, 2024 · Updated last year
HKUNLP / ProGen
[EMNLP-2022 Findings] Code for paper “ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback”.
☆27 · Feb 4, 2023 · Updated 3 years ago
xlang-ai / OSWorld-G
[NeurIPS 2025 Spotlight] Scaling Computer-Use Grounding via UI Decomposition and Synthesis
☆172 · Jun 18, 2026 · Updated last month
QiushiSun / DaSE-Cloud-Computing-2020
2020-2021 Fall (Cloud Computing and Development): course notes and projects for the Cloud Computing Applications and Development course
☆24 · Feb 19, 2021 · Updated 5 years ago
OS-Copilot / OS-Symphony
[ACL 2026 Main] Official repository for paper: OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agents
☆47 · Apr 7, 2026 · Updated 3 months ago
QiushiSun / ECNU-Undergraduate-Thesis-Template-2022
ECNU Undergraduate Thesis Template (Class of 2022)
☆26 · Apr 22, 2022 · Updated 4 years ago
OS-Copilot / OS-Atlas
OS-ATLAS: A Foundation Action Model For Generalist GUI Agents
☆452 · Apr 20, 2025 · Updated last year