KatherLab / ToolMaker
Turn GitHub repositories into LLM tools. (ACL 2025)
☆54Updated 5 months ago
Alternatives and similar repositories for ToolMaker
Users who are interested in ToolMaker are comparing it to the libraries listed below.
- Towards Medical Small Language Models with Self-Evolved Slow Thinking☆82Updated 5 months ago
- MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs☆231Updated 4 months ago
- [EMNLP'24] EHRAgent: Code Empowers Large Language Models for Complex Tabular Reasoning on Electronic Health Records☆111Updated 10 months ago
- ☆34Updated 5 months ago
- A Comprehensive Rare Disease Diagnostic Dataset with nearly 50,000 patients covering more than 4000 diseases☆16Updated 6 months ago
- ☆40Updated 5 months ago
- ☆48Updated 8 months ago
- ☆38Updated 5 months ago
- Codes and datasets for the paper Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Ref…☆68Updated 8 months ago
- ☆42Updated last year
- Top papers related to LLM-based agent evaluation☆86Updated 2 weeks ago
- [ICLR'25] ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery☆106Updated 2 months ago
- This repository contains ScholarQABench data and evaluation pipeline.☆85Updated 2 months ago
- Optimize Any User-defined Compound AI Systems☆61Updated 2 months ago
- Agent benchmark for medical diagnosis☆251Updated 10 months ago
- This is the official repository for HypoGeniC (Hypothesis Generation in Context) and HypoRefine, which are automated, data-driven tools t…☆90Updated last month
- [NeurIPS 2024 Datasets and Benchmark Track Oral] MedCalc-Bench: Evaluating Large Language Models for Medical Calculations☆75Updated this week
- OLAPH: Improving Factuality in Biomedical Long-form Question Answering☆37Updated last year
- Official repository of the MIRAGE benchmark☆177Updated last year
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales"☆109Updated last year
- MIRIAD is a million-scale Medical Instruction and RetrIeval Dataset☆128Updated 2 months ago
- Medical Hallucination in Foundation Models and Their Impact on Healthcare (2025)☆71Updated 7 months ago
- A collection of AWESOME language modeling techniques on tabular data applications.☆32Updated last year
- Discovering Data-driven Hypotheses in the Wild☆115Updated 4 months ago
- [ACL 2025] Multi-Agent System for Science of Science☆58Updated 3 months ago
- [EMNLP 2024 Findings] ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs☆29Updated 5 months ago
- ☆33Updated 9 months ago
- [ACL 2025] Analyzing LLMs' Multilingual Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations☆13Updated 2 weeks ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms"☆132Updated last year
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning☆72Updated 3 months ago