InternScience/SciEvalKit

A unified evaluation toolkit and leaderboard for rigorously assessing the scientific intelligence of large language and vision–language models across the full research workflow.

☆74

Alternatives and similar repositories for SciEvalKit

Users who are interested in SciEvalKit are comparing it to the libraries listed below.

OrqueIO / OrqueIO
Process Orchestration Framework: A camunda 7 fork
☆21 · Updated this week
shubhamprajapati7748 / sdlc-copilot
SDLC Copilot is an Agentic AI system designed to streamline and automate the Software Development Lifecycle (SDLC). From requirement gath…
☆23 · Jun 14, 2025 · Updated 8 months ago
RUC-NLPIR / EnvScaler
The official implementation of "EnvScaler: Scaling Tool-Interactive Environments for LLM Agent via Programmatic Synthesis".
☆95 · Feb 12, 2026 · Updated 2 weeks ago
mmtmn / Darwin-Godel-Machine
An open-ended, self-improving AI system that evolves its own source code using a local LLM. Built for autonomy, reflection, and code evol…
☆22 · Jan 24, 2026 · Updated last month
ayanglab / AIIB
☆10 · May 10, 2024 · Updated last year
danieldickison / kachiclash
A Grand Sumo prediction game
☆10 · Updated this week
Qwen-Applications / STAR
STAR: Similarity-guided Teacher-Assisted Refinement for Super-Tiny Function Calling Models
☆22 · Feb 12, 2026 · Updated 2 weeks ago
PRIS-CV / EAFT
EAFT (Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting) official repo
☆83 · Jan 15, 2026 · Updated last month
zh1yu4nyu / CodeIPPrompt
https://icml.cc/virtual/2023/poster/24354
☆10 · Aug 15, 2023 · Updated 2 years ago
VidCapBench / VidCapBench
☆13 · May 17, 2025 · Updated 9 months ago
H0w1 / CISB-dataset
☆12 · Apr 22, 2023 · Updated 2 years ago
SecurityLab-UCD / UniTSyn
[ISSTA'24] A Large-Scale Dataset Capable of Enhancing the Prowess of Large Language Models for Program Testing
☆12 · Jan 7, 2025 · Updated last year
microsoft / dataflow2text
Code for "The Whole Truth and Nothing But the Truth: Faithful and Controllable Dialogue Response Generation with Dataflow Transduction an…
☆11 · Apr 30, 2024 · Updated last year
wangbo15 / accmut
Accmut is a framework for accelerating mutation testing, which is based on LLVM-IR.
☆10 · Jan 25, 2018 · Updated 8 years ago
microsoft / Mnemis
Official Repository of "Mnemis: Dual-Route Retrieval on Hierarchical Graphs for Long-Term LLM Memory"
☆41 · Feb 18, 2026 · Updated last week
serialx / vibecore
Build your own AI-powered automation tools in the terminal with this extensible agent framework
☆22 · Jan 5, 2026 · Updated last month
qzc438 / ontology-llm
Agent-OM: Leveraging LLM Agents for Ontology Matching
☆18 · Jan 24, 2026 · Updated last month
Drup / llvmgraph
Ocamlgraph overlay for llvm
☆20 · Apr 4, 2015 · Updated 10 years ago
wtwofire / A-systematic-review-of-fuzzing-based-on-machine-learning-techniques
☆10 · Jul 9, 2020 · Updated 5 years ago
JWesleySM / Whiro
☆22 · Oct 30, 2024 · Updated last year
sutambe / cpp-generators
Composable Data and Type Generators for C++
☆10 · Mar 25, 2019 · Updated 6 years ago
tukcps / SysMD
SysMD is a SysML v2/KerML tool. It offers a low entry hurdle through its notebook-like UI. Unique to SysMD is its integrated solver that do…
☆33 · Dec 11, 2025 · Updated 2 months ago
sctg-development / ROCOv2-radiology
Radiology Object in COntext version 2
☆18 · Nov 13, 2024 · Updated last year
h4iku / T5APR
Repository for the paper "T5APR: Empowering Automated Program Repair across Languages through Checkpoint Ensemble."
☆11 · Oct 23, 2025 · Updated 4 months ago
jonnypei / acl23-preadd
☆12 · Jul 25, 2023 · Updated 2 years ago
yongzhuo / MacroGPT-Pretrain
Full-parameter pre-training of the MacroGPT large model (1.3B parameters, 32 layers), with multi-GPU DeepSpeed or single-GPU Adafactor
☆15 · Nov 30, 2023 · Updated 2 years ago
CodeLLM-Research / CodeJudge-Eval
[COLING25] CodeJudge Eval: Can Large Language Models be Good Judges in Code Understanding?
☆12 · Dec 3, 2024 · Updated last year
miaozhang0525 / iDARTS
Code for the ICML 2021 paper iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients
☆10 · May 27, 2021 · Updated 4 years ago
asyncins / OnehotCode
One-hot code for deep learning: encoding and decoding of one-hot codes for use in deep learning
☆12 · Aug 17, 2019 · Updated 6 years ago
PrithwishJana / CoTran
Official repository for CoTran: An LLM-based code translator for whole-program translation, fine-tuned using feedback from compiler and s…
☆16 · Nov 6, 2024 · Updated last year
troyhantech / deep-research
A minimalist deep research framework for any OpenAI API compatible LLMs.
☆14 · Nov 3, 2025 · Updated 3 months ago
thepurpleowl / codequeries-benchmark
Repository of the paper 'CodeQueries: A Dataset of Semantic Queries over Code' published in ISEC 2024
☆13 · Apr 21, 2024 · Updated last year
rutgers-apl / fpsanitizer
A debugger to detect and diagnose numerical errors in floating point programs
☆12 · Jun 19, 2022 · Updated 3 years ago
MSRSSP / hyperfuzzer-seeds
☆11 · Aug 10, 2021 · Updated 4 years ago
blockhousetech / guardian
☆10 · Feb 24, 2023 · Updated 3 years ago
jaopaulolc / KernelFaRer
KernelFaRer: Replacing Native-Code Idioms with High-Performance Library Calls
☆12 · Sep 7, 2025 · Updated 5 months ago
CalPolySEC / wrath-ctf-framework
What? Really? AnoTHer CTF Framework
☆12 · Oct 21, 2018 · Updated 7 years ago
FuturePathAI / Learn-AI-Engineering
Code, notebooks, and other material for FuturePath AI's training course on Generative AI
☆12 · Apr 24, 2025 · Updated 10 months ago
obaraelijah / webdev_tutorial
A tutorial for building Web Services in Rust with Actix-Web, SQLx, and PostgreSQL
☆13 · Mar 20, 2024 · Updated last year