[EMNLP 2025] A real-world clinical benchmark for medical LLMs with physician validation — 2,996 questions from EHRs
☆28May 21, 2026Updated 3 weeks ago
Alternatives and similar repositories for LLMEval-Med
Users that are interested in LLMEval-Med are comparing it to the libraries listed below.
- LLM evaluation on 2024 Chinese Gaokao Mathematics — zero-contamination benchmark with dual prompt formats☆21Apr 15, 2026Updated 2 months ago
- 🚀 [ICLR '25] RocketEval: Efficient Automated LLM Evaluation via Grading Checklist☆16Aug 21, 2025Updated 9 months ago
- [ACL 2024 Findings] DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling☆17Jun 6, 2024Updated 2 years ago
- This is the 2024 OS lab repository.☆11Jun 27, 2024Updated last year
- [ESWC '24] This repo is the official implementation for the paper "Towards Harnessing Large Language Models as Autonomous Agents for Semantic…☆10May 25, 2024Updated 2 years ago
- Extract corpora from Wikipedia dumps☆26Mar 26, 2019Updated 7 years ago
- [ACL'26 Findings] Official code for "BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search"☆29Apr 23, 2026Updated last month
- Code repository for the paper "TransferTOD: A Generalizable Chinese Multi-Domain Task-Oriented Dialogue System with Transfer Capabilities"☆20May 12, 2026Updated last month
- ☆206Apr 14, 2026Updated 2 months ago
- The official implementation of the paper "Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models" (NeurIPS 2025 Pos…☆75Sep 29, 2025Updated 8 months ago
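- Repository for the 2024 BUAA OS course; different branches contain the code for different labs, along with notes, thought-question assignments, and more☆29Jul 13, 2024Updated last year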
- DoctorRAG is a medical AI that mimics doctor-like reasoning by combining textbook knowledge with insights from similar patient cases, usi…☆22May 21, 2025Updated last year
- Career advice repo for fellow D.Com students☆12May 17, 2023Updated 3 years ago
- Repository for the research work "Ontology Generation using Large Language Models", presented at ESWC 2025.☆36Aug 15, 2025Updated 10 months ago
- ☆24Sep 1, 2025Updated 9 months ago
- Code and data for the paper "Large language models for automatic equation discovery of nonlinear dynamics"☆13Mar 6, 2025Updated last year
- Code release for "FlashBias: Fast Computation of Attention with Bias" (NeurIPS 2025), https://arxiv.org/abs/2505.12044☆29Nov 17, 2025Updated 6 months ago
- An open-source alternative to v0.dev. Cost-effective, highly customizable, and seamlessly integrated within GitHub.☆33Jan 24, 2024Updated 2 years ago
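- A repository for automatic recognition and processing of bills and invoices, intended as a useful reference for anyone with similar business needs☆39Apr 14, 2023Updated 3 years ago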
- Using Seq2Seq transformers for Text2SQL task on WikiSQL dataset.☆12Jan 8, 2022Updated 4 years ago
- ☆17Dec 31, 2023Updated 2 years ago
- Papers and code for Physics-informed Deep Compositional Operator Network☆13Oct 31, 2025Updated 7 months ago
- Python + Selenium automation script for the BUAA venue reservation system☆34Apr 27, 2022Updated 4 years ago
- Latent Knowledge-Guided Video Diffusion for Scientific Phenomena Generation from a Single Initial Frame☆17May 2, 2026Updated last month
- MDRDC dataset and the baselines used☆11Feb 20, 2023Updated 3 years ago
- We present cod-bench containing 12 operators and 10 datasets.☆11Jun 5, 2024Updated 2 years ago
- ☆14Feb 14, 2024Updated 2 years ago
- Code for "Holistic Physics Solver: Learning PDEs in a Unified Spectral-Physical Space"☆24Mar 25, 2026Updated 2 months ago
- ☆15Jul 18, 2025Updated 10 months ago
- This repository contains the code for the paper: Deciphering and integrating invariants for neural operator learning with various physica…☆13Mar 18, 2024Updated 2 years ago
- [ACL 2026] A large-scale longitudinal study on robust and fair evaluation of LLMs — 200K+ generative questions across 13 disciplines☆39May 21, 2026Updated 3 weeks ago
- ☆14Aug 22, 2024Updated last year
- ☆15Mar 6, 2024Updated 2 years ago
- Basic setup and easy to follow templates to interact and search CogStack for data analysts☆12Sep 18, 2025Updated 8 months ago
- Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning☆154Jun 1, 2026Updated 2 weeks ago
- my own project☆37Jul 15, 2024Updated last year
- ☆24Aug 14, 2025Updated 10 months ago
- BUAA OS Lab "MOS" Open Source Repository | Open-source code repository for the MOS kernel labs of the BUAA operating systems course☆65May 2, 2025Updated last year
- [Nature Communications, 2026] The official code for "Boosting Pathology Foundation Models via Few-shot Prompt-tuning for Rare Cancer Subt…☆27Apr 14, 2026Updated 2 months ago