mims-harvard / CUREBench
CUREBench @ NeurIPS 2025: Benchmarking AI reasoning for therapeutic decision-making at scale
☆122 Updated 2 weeks ago
Alternatives and similar repositories for CUREBench
Users who are interested in CUREBench are comparing it to the repositories listed below.
- ICLR'24 | BioBridge: Bridging Biomedical Foundation Models via Knowledge Graphs ☆76 Updated last year
- MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning ☆67 Updated 2 months ago
- A curated list of LLM powered AI Agents in Biomedical Research. Medical Image Analysis, Multi-omics Genomics Analysis, Biomedical Scienti… ☆69 Updated 2 months ago
- [ICML'25] MedTok: Multimodal Medical Code Tokenizer ☆32 Updated 5 months ago
- Democratizing AI scientists with ToolUniverse ☆740 Updated last week
- A specialized LLM for study search, study screening, and data extraction from medical literature. ☆24 Updated 9 months ago
- BioKGBench: A Knowledge Graph Checking Benchmark of AI Agent for Biomedical Science ☆24 Updated last year
- BioDiscoveryAgent is an LLM-based AI agent for closed-loop design of genetic perturbation experiments ☆92 Updated 5 months ago
- [NeurIPS 2023] Official codes of "MuSe-GNN: Learning Unified Gene Representation From Multimodal Biological Graph Data" ☆30 Updated 5 months ago
- [COLM 2024] Large Language Models as Biomedical Hypothesis Generators: A Comprehensive Evaluation ☆14 Updated last year
- Code for CTO: A Large Clinical Trial Outcome and QA Dataset ☆28 Updated 2 weeks ago
- PRESTO: Progressive Pretraining Enhances Synthetic Chemistry Outcomes [EMNLP 2024] ☆28 Updated last year
- Code and data for Cell-o1. ☆26 Updated 3 months ago
- 🔥🔥🔥 Latest Papers, Codes and Datasets on Large Biology Models! ☆24 Updated 2 years ago
- [ML4H'25] m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models ☆47 Updated 8 months ago
- ☆48 Updated 9 months ago
- [NeurIPS 2024] BEACON: Benchmark for Comprehensive RNA Tasks and Language Models ☆56 Updated last year
- [EMNLP2024] Benchmark for "Large Language Models Are Poor Clinical Decision-Makers: A Comprehensive Benchmark" ☆35 Updated 3 months ago
- Awesome-Biomolecule-Language-Cross-Modeling: a curated list of resources for paper "Leveraging Biomolecule and Natural Language through M… ☆238 Updated 2 weeks ago
- ☆43 Updated last year
- Paper list of agent for science ☆173 Updated last week
- A toolkit for developing foundation models using Electronic Health Record (EHR) data. ☆46 Updated this week
- Must-read papers on AI for Biology ☆23 Updated 2 years ago
- ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning ☆103 Updated last month
- [COLM'24] We propose Protein Chain of Thought (ProCoT), which replicates the biological mechanism of signaling pathways as language promp… ☆70 Updated 3 weeks ago
- ☆36 Updated 10 months ago
- [KDD2024 ADS Track] RareBench: Can LLMs Serve as Rare Diseases Specialists? ☆28 Updated 3 weeks ago
- ☆21 Updated 9 months ago
- KGARevion: AI Agent for Knowledge-Intensive Biomedical QA ☆40 Updated 9 months ago
- [ICLR 2024] Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models ☆289 Updated last year