[EMNLP 2025] A real-world clinical benchmark for medical LLMs with physician validation — 2,996 questions from EHRs
☆27May 12, 2026Updated 2 weeks ago
Alternatives and similar repositories for LLMEval-Med
Users who are interested in LLMEval-Med are comparing it to the libraries listed below.
- LLM evaluation on 2024 Chinese Gaokao Mathematics — zero-contamination benchmark with dual prompt formats☆19Apr 15, 2026Updated last month
- 🚀 [ICLR '25] RocketEval: Efficient Automated LLM Evaluation via Grading Checklist☆16Aug 21, 2025Updated 9 months ago
- [ACL2024 Findings]DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling☆17Jun 6, 2024Updated last year
- This is the 2024 OS lab repository.☆11Jun 27, 2024Updated last year
- [ESWC '24] This repo is official implementation for the paper "Towards Harnessing Large Language Models as Autonomous Agents for Semantic…☆10May 25, 2024Updated 2 years ago
- ☆191Apr 14, 2026Updated last month
- Extract corpora from Wikipedia dumps☆26Mar 26, 2019Updated 7 years ago
- [ACL'26 Findings] Official code for "BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search"☆29Apr 23, 2026Updated last month
- The code repository of paper "TransferTOD: A Generalizable Chinese Multi-Domain Task-Oriented Dialogue System with Transfer Capabilities"☆20May 12, 2026Updated 2 weeks ago
- The official implementation of the paper "Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models" (NeurIPS 2025 Pos…☆73Sep 29, 2025Updated 7 months ago
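- Repository for the 2024 BUAA OS course; different branches contain the code for different labs, along with notes, discussion-question assignments, and more☆29Jul 13, 2024Updated last year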
- DoctorRAG is a medical AI that mimics doctor-like reasoning by combining textbook knowledge with insights from similar patient cases, usi…☆22May 21, 2025Updated last year
- Career advice repo for D.Com students☆12May 17, 2023Updated 3 years ago
- Repository for the research work "Ontology Generation using Large Language Models", presented at ESWC 2025.☆36Aug 15, 2025Updated 9 months ago
- ☆24Sep 1, 2025Updated 8 months ago
- Code and data for paper named: Large language models for automatic equation discovery of nonlinear dynamics☆13Mar 6, 2025Updated last year
- Code release for "FlashBias: Fast Computation of Attention with Bias" (NeurIPS 2025), https://arxiv.org/abs/2505.12044☆28Nov 17, 2025Updated 6 months ago
- An open-source alternative to v0.dev. Cost-effective, highly customizable, and seamlessly integrated within GitHub.☆33Jan 24, 2024Updated 2 years ago
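- A repository for automatic recognition and processing of bills and receipts; hopefully a useful reference for those with similar business needs☆39Apr 14, 2023Updated 3 years ago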
- Using Seq2Seq transformers for Text2SQL task on WikiSQL dataset.☆12Jan 8, 2022Updated 4 years ago
- ☆17Dec 31, 2023Updated 2 years ago
- Papers and codes of Physics-informed Deep Compositional Operator Network☆13Oct 31, 2025Updated 6 months ago
- Python + Selenium automation script for the BUAA venue reservation system☆35Apr 27, 2022Updated 4 years ago
- Latent Knowledge-Guided Video Diffusion for Scientific Phenomena Generation from a Single Initial Frame☆17May 2, 2026Updated 3 weeks ago
- MDRDC dataset and used baselines☆12Feb 20, 2023Updated 3 years ago
- We present cod-bench containing 12 operators and 10 datasets.☆11Jun 5, 2024Updated last year
- ☆14Feb 14, 2024Updated 2 years ago
- Code for "Holistic Physics Solver: Learning PDEs in a Unified Spectral-Physical Space"☆24Mar 25, 2026Updated 2 months ago
- ☆15Jul 18, 2025Updated 10 months ago
- This repository contains the code for the paper: Deciphering and integrating invariants for neural operator learning with various physica…☆13Mar 18, 2024Updated 2 years ago
- [ACL 2026] A large-scale longitudinal study on robust and fair evaluation of LLMs — 200K+ generative questions across 13 disciplines☆37May 12, 2026Updated 2 weeks ago
- ☆14Aug 22, 2024Updated last year
- ☆15Mar 6, 2024Updated 2 years ago
- Basic setup and easy to follow templates to interact and search CogStack for data analysts☆12Sep 18, 2025Updated 8 months ago
- Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning☆153May 13, 2026Updated last week
- my own project☆37Jul 15, 2024Updated last year
- ☆22Aug 14, 2025Updated 9 months ago
- BUAA OS Lab "MOS" Open Source Repository | Open-source code repository for the MOS kernel labs of the BUAA operating systems course☆64May 2, 2025Updated last year
- [Nature Communications, 2026] The official code for "Boosting Pathology Foundation Models via Few-shot Prompt-tuning for Rare Cancer Subt…☆24Apr 14, 2026Updated last month