microsoft / healthcare-ai-model-evaluator
Healthcare AI Model Evaluator (HAIME) empowers healthcare organizations to independently evaluate and customize AI solutions, addressing challenges of transparency, clinical relevance, and real-world impact. By putting control in the hands of clinical professionals, it enables confident, context-specific adoption of AI in healthcare.
☆36Updated this week
Alternatives and similar repositories for healthcare-ai-model-evaluator
Users who are interested in healthcare-ai-model-evaluator are comparing it to the libraries listed below.
- MedAlign is a clinician-generated dataset for instruction following with electronic medical records.☆97Updated 8 months ago
- MedAgentBench: A Realistic Virtual EHR Environment to Benchmark Medical LLM Agents☆212Updated 2 months ago
- MIRIAD is a million-scale Medical Instruction and Retrieval Dataset☆142Updated 2 months ago
- A comprehensive repository of reasoning tasks for Medical LLMs (and beyond)☆132Updated last year
- MedEmbed is a collection of embedding models fine-tuned specifically for medical and clinical data.☆84Updated 2 months ago
- ☆166Updated 2 months ago
- A Python Natural Language Processing Toolkit for Medical Text Generation☆84Updated 8 months ago
- Code and data for TrialGPT.☆140Updated last year
- For Med-Gemini, we relabeled the MedQA benchmark; this repo includes the annotations and analysis code.☆66Updated last year
- ☆44Updated 2 years ago
- Expert-Curated Oncology Reports to Advance Language Model Inference☆31Updated last year
- A private and secure generative AI tool, based on GPT and o models, deployed for non-clinical use at Dana-Farber Cancer Institute☆64Updated last year
- LLM Embeddings for ICD 10 Data☆63Updated last year
- Clinical text summarization by adapting large language models☆155Updated last year
- Almanac: Retrieval-Augmented Language Models for Clinical Medicine☆38Updated last year
- Anonymize Medical Documents using LLMs☆59Updated last year
- MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs☆250Updated 7 months ago
- Fact Verification for Clinical Notes with LLMs☆36Updated last month
- ☆40Updated last week
- A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings.☆174Updated last week
- A benchmark for few-shot evaluation of foundation models for electronic health records (EHRs)☆209Updated 7 months ago
- Verifying Facts in LLM-Generated Clinical Text with Electronic Health Records☆23Updated 3 weeks ago
- NeurIPS'24 DB (Spotlight) | Instruction Tuning Large Language Models to Understand Electronic Health Records☆56Updated 4 months ago
- OLAPH: Improving Factuality in Biomedical Long-form Question Answering☆37Updated last year
- Basic entity linker for the SNOMED EL Challenge☆13Updated 2 years ago
- ☆117Updated 2 years ago
- Official page for ICLR 2025 paper "Sufficient Context: A New Lens on Retrieval Augmented Generation Systems"☆63Updated 6 months ago
- build and benchmark deep research☆211Updated this week
- Agent benchmark for medical diagnosis☆274Updated last year
- public code repository for paper "Health system scale language models are general purpose clinical prediction engines"☆123Updated 2 years ago