aws-samples / evaluating-large-language-models-using-llm-as-a-judge
★20 · Updated 11 months ago
Alternatives and similar repositories for evaluating-large-language-models-using-llm-as-a-judge
Users interested in evaluating-large-language-models-using-llm-as-a-judge are comparing it to the libraries listed below.
- A simple Streamlit application to visualize document chunks and queries in embedding space (★13, updated 7 months ago)
- Writing Blog Posts with Generative Feedback Loops! (★50, updated last year)
- Dynamic Metadata-based RAG Framework (★78, updated this week)
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… (★51, updated last year)
- The central repository for all RAG evaluation reference material and partner workshops