psunlpgroup / ReaLMistake

This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses".
26 · Updated 3 months ago

Related projects

Alternatives and complementary repositories for ReaLMistake