haizelabs / verdict
Inference-time scaling for LLMs-as-a-judge.
☆320 Updated 2 months ago
Alternatives and similar repositories for verdict
Users who are interested in verdict are comparing it to the libraries listed below.
- ⚖️ Awesome LLM Judges ⚖️ ☆148 Updated 8 months ago
- A framework for optimizing DSPy programs with RL ☆303 Updated this week
- Red-Teaming Language Models with DSPy ☆249 Updated 10 months ago
- ☆136 Updated 9 months ago
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆454 Updated last year
- Curated collection of community environments ☆196 Updated 2 weeks ago
- Sphynx Hallucination Induction ☆53 Updated 11 months ago
- A small library of LLM judges ☆311 Updated 5 months ago
- TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle ☆302 Updated 3 weeks ago
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research.