AI4LIFE-GROUP / med-safety-bench
MedSafetyBench: Evaluating and Improving the Medical Safety of LLMs [NeurIPS 2024]
☆30Updated last week
Alternatives and similar repositories for med-safety-bench
Users who are interested in med-safety-bench are comparing it to the repositories listed below.
- (ICML 2023) Discover and Cure: Concept-aware Mitigation of Spurious Correlation☆41Updated last year
- ☆29Updated 5 months ago
- Dataset for Checking Consistency between Unstructured Notes and Structured Tables in Electronic Health Records☆23Updated 11 months ago
- [NeurIPS 2023] Official repository for "Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models"☆12Updated last year
- Official Code Repository for the paper "Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-intensive Tasks…☆39Updated 7 months ago
- Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning (ICML 2023)☆18Updated last year
- ☆31Updated last year
- [ACL 2024 Findings] This is the code for our paper "Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation wi…☆39Updated last year
- [CVPR 2025] MicroVQA eval and 🤖RefineBot code for "MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research"…☆21Updated 2 weeks ago
- EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images, NeurIPS 2023 D&B☆81Updated last year
- code for EMNLP 2024 paper: How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for M…☆13Updated 8 months ago
- [NeurIPS 2024 Datasets and Benchmark Track Oral] MedCalc-Bench: Evaluating Large Language Models for Medical Calculations☆68Updated last week
- Medical multi-modal learning with missing modality data (MLHC 2023)☆13Updated last year
- Code for the paper "ICON: Improving Inter-Report Consistency in Radiology Report Generation via Lesion-aware Mixup Augmentation" (EMNLP'2…☆18Updated 7 months ago
- EMNLP'22 | PromptEHR: Conditional Electronic Healthcare Records Generation with Prompt Learning☆29Updated 2 years ago
- code repo for ICLR 2024 paper "Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs"☆124Updated last year
- A collection of resources and information for concrete skills that are helpful when pursuing a PhD in computer science (specifically in M…☆23Updated 2 years ago
- Official code repository for Correct-N-Contrast☆22Updated 3 years ago
- DiReCT: Diagnostic Reasoning for Clinical Notes via Large Language Models (NeurIPS 2024 D&B Track)☆21Updated 4 months ago
- Official code and dataset for our NAACL 2024 paper: DialogCC: An Automated Pipeline for Creating High-Quality Multi-modal Dialogue Datase…☆13Updated last year
- Representation Surgery for Multi-Task Model Merging. ICML, 2024.☆45Updated 9 months ago
- The code repository for ICML24 paper "Tabular Insights, Visual Impacts: Transferring Expertise from Tables to Images"☆19Updated 4 months ago
- MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning☆47Updated 2 weeks ago
- Lightweight Adapting for Black-Box Large Language Models☆22Updated last year
- source code for NeurIPS'24 paper "HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection"☆50Updated 3 months ago
- MultiModN – Multimodal, Multi-Task, Interpretable Modular Networks (NeurIPS 2023)☆33Updated last year
- ☆28Updated 4 months ago
- ☆48Updated 4 months ago
- Repo for the paper "Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions"☆38Updated last week
- 🤫 Code and benchmark for our ICLR 2024 spotlight paper: "Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Con…☆42Updated last year