A unified evaluation toolkit and leaderboard for rigorously assessing the scientific intelligence of large language and vision–language models across the full research workflow.
☆74Feb 27, 2026Updated 3 weeks ago
Alternatives and similar repositories for SciEvalKit
Users who are interested in SciEvalKit compare it to the repositories listed below
- The first high school physics Olympiad benchmark for evaluating (M)LLMs with step-level grading and human-level comparison.☆25Dec 19, 2025Updated 3 months ago
- Medical Visual Question Answering via Conditional Reasoning [ACM MM 2020]☆63Aug 20, 2021Updated 4 years ago
- ☆13Jan 14, 2026Updated 2 months ago
- [ACM MM23] PyTorch implementation of the paper: SUG: Single-dataset Unified Generalization for 3D Point Cloud Classification☆12Jul 4, 2023Updated 2 years ago
- An open-ended, self-improving AI system that evolves its own source code using a local LLM. Built for autonomy, reflection, and code evol…☆22Jan 24, 2026Updated last month
- STAR: Similarity-guided Teacher-Assisted Refinement for Super-Tiny Function Calling Models☆40Updated this week
- Code used in the paper "CheXseg: Combining Expert Annotations with DNN-generated Saliency Maps for X-ray Segmentation"☆14Oct 14, 2021Updated 4 years ago
- Official repository for CoTran: An LLM-based code translator for whole-program translation, fine-tuned using feedback from compiler and s…☆16Nov 6, 2024Updated last year
- [ACL 2025] Multi-Agent System for Science of Science☆66Jul 26, 2025Updated 7 months ago
- Official code repo for paper: ACROSS: An Alignment-based Framework for Low-Resource Many-to-One Cross-Lingual Summarization☆12Jul 15, 2023Updated 2 years ago
- [ICCV2025] Constructing Ophthalmic MLLM for Positioning-diagnosis Collaboration Through Clinical Cognitive Chain Reasoning☆23Nov 13, 2025Updated 4 months ago
- ☆10May 10, 2024Updated last year
- [Sci. Rep. 2025] Revisiting model scaling with a U-net benchmark for 3D medical image segmentation☆18Aug 21, 2025Updated 7 months ago
- A curated collection of papers, datasets, and resources on Scientific Datasets and Large Language Models (LLMs)☆442Oct 3, 2025Updated 5 months ago
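- This project designs an electronic keyboard that can produce 21 notes. Input comes from a PS2 keyboard, the Basys2 board recognizes and processes it to generate sound at specific frequencies, and the sound is output through the Pmod_AMP module.☆10Jul 21, 2019Updated 6 years ago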
- ☆12Oct 24, 2024Updated last year
- ☆19Jul 22, 2025Updated 8 months ago
- Topological Sculpting of 3D Fine-grained Tubular Shapes☆21Nov 25, 2025Updated 3 months ago
- CoV: Chain-of-View Prompting for Spatial Reasoning☆52Jan 23, 2026Updated last month
- A collection of state-of-the-art single image super resolution methods.☆13Apr 26, 2021Updated 4 years ago
- ☆17Feb 18, 2026Updated last month
- Radiology Object in COntext version 2☆19Nov 13, 2024Updated last year
- [EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality☆19Mar 4, 2025Updated last year
- ☆21Dec 22, 2024Updated last year
- An mmcv logger hook that can be used to push model results to WeChat☆21Oct 11, 2022Updated 3 years ago
- This repository provides a comprehensive library for parallel training and LoRA algorithm implementations, supporting multiple parallel s…☆57Jan 6, 2026Updated 2 months ago
- Deep learning on SAR images☆15May 3, 2015Updated 10 years ago
- [NeurIPS 2024] FNP: Fourier Neural Processes for Arbitrary-Resolution Data Assimilation☆15Mar 4, 2025Updated last year
- ☆25Jan 19, 2026Updated 2 months ago
- [In Progress] HaN5K: A project to develop foundation models for structure delineation in head and neck radiotherapy based on more than …☆16Dec 25, 2023Updated 2 years ago
- CUREBench @ NeurIPS 2025: Benchmarking AI reasoning for therapeutic decision-making at scale☆129Dec 6, 2025Updated 3 months ago
- ☆15Mar 11, 2023Updated 3 years ago
- Official implementation of "3D-Aware Hypothesis & Verification for Generalizable Relative Object Pose Estimation", ICLR 2024☆12Nov 5, 2024Updated last year
- MC-CoT implementation code☆22Jun 24, 2025Updated 8 months ago
- [ACM MM 2023] Mask-Guided Progressive Network for Joint Raindrop and Rain Streak Removal in Videos☆18Jul 22, 2024Updated last year
- ☆13May 17, 2025Updated 10 months ago
- EAFT(Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting) official repo☆93Jan 15, 2026Updated 2 months ago
- Boosting Unsupervised Semantic Segmentation with Principal Mask Proposals (TMLR 2024)☆18Nov 27, 2024Updated last year
- Code for the ICML 2021 paper iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients☆10May 27, 2021Updated 4 years ago