UCSC-VLAA/ReasoningEval

Official repo of "Knowledge or Reasoning? A Close Look at How LLMs Think Across Domains".

☆43

Alternatives and similar repositories for ReasoningEval

Users that are interested in ReasoningEval are comparing it to the libraries listed below.

haojinw0027 / MedFrameQA
MedFrameQA: A Multi-Image Medical VQA Benchmark for Clinical Reasoning
☆18 · Jun 6, 2025 · Updated last year
UCSC-VLAA / EarthWhere
☆16 · Nov 15, 2025 · Updated 8 months ago
UCSC-VLAA / VLAA-Thinking
[TMLR 25] SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
☆148 · Oct 10, 2025 · Updated 9 months ago
UCSC-VLAA / ClinSeekAgent
☆30 · Jun 1, 2026 · Updated last month
UCSC-VLAA / CLIPS
An Enhanced CLIP Framework for Learning with Synthetic Captions
☆40 · Apr 18, 2025 · Updated last year
UCSC-VLAA / m1
[ML4H'25] m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models
☆51 · Dec 21, 2025 · Updated 7 months ago
UCSC-VLAA / EpiFoundation
PyTorch implementation of EpiFoundation
☆26 · Feb 25, 2025 · Updated last year
MAGIC-AI4Med / RadABench
The official codes for "Can Modern LLMs Act as Agent Cores in Radiology Environments?"
☆29 · Jan 22, 2025 · Updated last year
UCSC-VLAA / STAR-1
[AAAI'26 Oral] Official Implementation of STAR-1: Safer Alignment of Reasoning LLMs with 1K Data
☆38 · Apr 7, 2025 · Updated last year
UCSC-VLAA / MedReason
MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs
☆280 · Jun 19, 2025 · Updated last year
BlueZeros / AgentEHR
Agentic System, Tool Use, Electronic Health Record, Large Language Models, Clinical Natural Language Processing
☆24 · Apr 13, 2026 · Updated 3 months ago
UCSC-VLAA / AttnGCG-attack
[TMLR 2025] Official implementation of AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation
☆27 · Jun 17, 2025 · Updated last year
OliverRensu / MVG
☆61 · Jun 18, 2024 · Updated 2 years ago
eth-lre / LLM_ICL
ACL24
☆11 · Jun 7, 2024 · Updated 2 years ago
MAGIC-AI4Med / ChestX-Reasoner
☆39 · Mar 19, 2026 · Updated 4 months ago
rajpurkarlab / CXR-ReDonE
Official PyTorch implementation of https://arxiv.org/abs/2210.06340 (NeurIPS '22)
☆21 · Nov 14, 2022 · Updated 3 years ago
Guitaricet / my_pefty_llama
Minimal implementation of multiple PEFT methods for LLaMA fine-tuning
☆13 · May 7, 2023 · Updated 3 years ago
shengliu66 / Cerebra
Official implementation of Cerebra
☆22 · Jul 4, 2026 · Updated 3 weeks ago
maximek3 / MIMIC-NLE
☆21 · Jul 25, 2022 · Updated 4 years ago
OliverRensu / SDMP
☆19 · Jan 2, 2023 · Updated 3 years ago
ASGMVLP / ASGMVLP_CODE
The repo of ASGMVLP
☆19 · Jan 16, 2026 · Updated 6 months ago
iheallab / apricotM
This repository contains the official code for the paper "Real-time prediction of intensive care unit patient acuity and therapy requirem…
☆21 · Jul 22, 2025 · Updated last year
microsoft / x-reasoner
X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains
☆49 · Feb 4, 2026 · Updated 5 months ago
UCSC-VLAA / MedVLSynther
[ICLR'26] MedVLSynther: Synthesizing High-Quality Visual Question Answering from Medical Documents with Generator-Verifier LMMs
☆19 · Nov 1, 2025 · Updated 8 months ago
bairdzhang / des
☆19 · Mar 27, 2018 · Updated 8 years ago
MAGIC-AI4Med / MedRBench
[Nature Communications] The official code for "Quantifying the Reasoning Abilities of LLMs on Real-world Clinical Cases".
☆70 · Nov 7, 2025 · Updated 8 months ago
AutoMedBench / AutoMedBench
MedAutoBench: Medical AutoResearch Benchmark for Autonomous AI Agents
☆55 · Jul 9, 2026 · Updated 2 weeks ago
UCSC-VLAA / MeDiM
☆32 · Dec 1, 2025 · Updated 7 months ago
ncbi-nlp / Clinical-Tool-Learning
☆27 · Aug 10, 2025 · Updated 11 months ago
yale-nlp / refdpo
☆16 · Jul 23, 2024 · Updated 2 years ago
rajpurkarlab / ReXrank
☆28 · Updated this week
MAGIC-AI4Med / DiagGym
A virtual clinical environment for self-evolving LLM diagnostic agents.
☆108 · Feb 12, 2026 · Updated 5 months ago
UCSC-VLAA / Sight-Beyond-Text
[TMLR 2024] Official implementation of "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics"
☆20 · Sep 15, 2023 · Updated 2 years ago
MAGIC-AI4Med / MedS-Ins
[npj digital medicine] The official codes for "Towards Evaluating and Building Versatile Large Language Models for Medicine"
☆79 · May 5, 2025 · Updated last year
MIC-DKFZ / anatomy_informed_DA
☆21 · Nov 28, 2023 · Updated 2 years ago
wjhou / Recap
[EMNLP 2023 Findings] RECAP: Towards Precise Radiology Report Generation via Dynamic Disease Progression Reasoning
☆28 · Jun 12, 2025 · Updated last year
UCSC-VLAA / MedTrinity-25M
[ICLR 2025] This is the official repository of our paper "MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations…
☆413 · Jul 11, 2025 · Updated last year
UCSC-VLAA / MicroDiffusion
[CVPR 2024] This repository includes the official implementation of our paper "MicroDiffusion: Implicit Representation-Guided Diffusion for …
☆55 · May 13, 2024 · Updated 2 years ago
wangf3014 / Adventurer
☆29 · Feb 27, 2025 · Updated last year