[AAAI 2024] SciEval: A Multi-Level Large Language Model Evaluation Benchmark for Scientific Research
☆30 · Aug 6, 2024 · Updated last year
Alternatives and similar repositories for SciEval
Users who are interested in SciEval are comparing it to the libraries listed below.
- ☆130 · Jul 8, 2024 · Updated last year
- ☆10 · Dec 20, 2023 · Updated 2 years ago
- [WWW 25] USPTO-LLM: A Large Language Model-Assisted Information-enriched Chemical Reaction Dataset ☆16 · Dec 12, 2024 · Updated last year
- [ICML 2025] Official repository for paper "OR-Bench: An Over-Refusal Benchmark for Large Language Models" ☆23 · Mar 4, 2025 · Updated last year
- ☆14 · Apr 16, 2024 · Updated last year
- [NeurIPS 24] Can LLMs Solve Molecule Puzzles? A Multimodal Benchmark for Molecular Structure Elucidation ☆18 · Jan 2, 2026 · Updated 2 months ago
- ☆10 · Apr 20, 2022 · Updated 3 years ago
- ☆15 · Dec 4, 2023 · Updated 2 years ago
- ☆16 · Jan 5, 2021 · Updated 5 years ago
- ☆21 · Feb 3, 2026 · Updated last month
- A quantitative benchmark and analysis of molecular large language models. ☆18 · Jun 3, 2025 · Updated 9 months ago
- Tailoring Molecules for Protein Pockets: a Transformer-based Generative Solution for Structure-based Drug Design ☆20 · Jul 26, 2023 · Updated 2 years ago
- Official Repository of "AtomThink: Multimodal Slow Thinking with Atomic Step Reasoning" ☆62 · Nov 18, 2025 · Updated 3 months ago
- Drug-target binding affinity prediction using representation learning, graph mining, and machine learning ☆25 · Mar 21, 2022 · Updated 3 years ago
- ☆40 · Jun 20, 2025 · Updated 8 months ago
- ☆31 · Jun 12, 2024 · Updated last year
- [NeurIPS 2024] Code and Data Repo for Paper "Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning" ☆28 · May 28, 2024 · Updated last year
- PRESTO: Progressive Pretraining Enhances Synthetic Chemistry Outcomes [EMNLP 2024] ☆28 · Nov 18, 2024 · Updated last year
- ICLR 2025 paper: 3DMolFormer: A Dual-channel Framework for Structure-based Drug Discovery ☆27 · Apr 25, 2025 · Updated 10 months ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems. ☆64 · Jul 8, 2024 · Updated last year
- ☆34 · Dec 2, 2025 · Updated 3 months ago
- ☆32 · May 10, 2025 · Updated 9 months ago
- ☆28 · Aug 20, 2022 · Updated 3 years ago
- [NeurIPS 2023] "Rethinking Tokenizer and Decoder in Masked Graph Modeling for Molecules" ☆40 · Mar 16, 2024 · Updated last year
- ☆41 · Mar 26, 2025 · Updated 11 months ago
- [SCIS] MULTI-Benchmark: Multimodal Understanding Leaderboard with Text and Images ☆44 · Nov 19, 2025 · Updated 3 months ago
- ☆11 · Oct 28, 2021 · Updated 4 years ago
- ☆11 · Sep 24, 2024 · Updated last year
- [ICLR 2024] This is the official implementation for the paper: "Beyond imitation: Leveraging fine-grained quality signals for alignment" ☆10 · May 5, 2024 · Updated last year
- ☆43 · Dec 1, 2025 · Updated 3 months ago
- ☆11 · Dec 22, 2024 · Updated last year
- ☆12 · Jan 11, 2026 · Updated last month
- ☆16 · Dec 2, 2025 · Updated 3 months ago
- [CVPR2024] Learning from Synthetic Human Group Activities ☆14 · Feb 24, 2025 · Updated last year
- Bayesian inference of conformational populations ☆13 · Jun 11, 2025 · Updated 8 months ago
- Streamlit web application to deploy a machine learning binary classifier to predict the activity of antimicrobial peptides ☆10 · Dec 13, 2022 · Updated 3 years ago
- A Swedish Natural Language Understanding Benchmark ☆11 · Dec 12, 2025 · Updated 2 months ago
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference … ☆14 · Dec 12, 2024 · Updated last year
- Discriminator for Model Docking ☆11 · Dec 20, 2024 · Updated last year