aws-samples / evaluating-large-language-models-using-llm-as-a-judge
☆19 Updated 7 months ago
Alternatives and similar repositories for evaluating-large-language-models-using-llm-as-a-judge
Users who are interested in evaluating-large-language-models-using-llm-as-a-judge are comparing it to the libraries listed below.
- Codebase accompanying the Summary of a Haystack paper. ☆79 Updated 11 months ago
- ☆41 Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated last year
- A method for steering LLMs to better follow instructions ☆49 Updated 3 weeks ago
- ☆49 Updated 11 months ago
- ☆80 Updated last year
- 🔎 A deep-dive into HyDE for Advanced LLM RAG + 💡 Introducing AutoHyDE, a semi-supervised framework to improve the effectiveness, covera… ☆32 Updated last year
- Official Repo for CRMArena and CRMArena-Pro ☆109 Updated 2 months ago
- ☆20 Updated 10 months ago
- ☆40 Updated 8 months ago
- Streamlit app for recommending eval functions using prompt diffs ☆29 Updated last year
- Dynamic Metadata based RAG Framework ☆75 Updated last year
- ☆24 Updated 8 months ago
- Writing Blog Posts with Generative Feedback Loops! ☆50 Updated last year
- A simple Streamlit application to visualize document chunks and queries in embedding space 🗺️🔍 ☆13 Updated 4 months ago
- Explore the use of DSPy for extracting features from PDFs 🔎 ☆46 Updated last year
- A curated list of materials on AI guardrails ☆40 Updated 2 months ago
- This repository contains a pipeline for fine-tuning Large Language Models (LLMs) for Text-to-SQL conversion using General Reward Proximal… ☆33 Updated 4 months ago
- ☆145 Updated last year
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆111 Updated 4 months ago
- ☆56 Updated 2 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆75 Updated 10 months ago
- ☆77 Updated 7 months ago
- ☆48 Updated last year
- ☆94 Updated 5 months ago
- Code and Dataset for Learning to Solve Complex Tasks by Talking to Agents ☆24 Updated 3 years ago
- ☆50 Updated 3 months ago
- A framework for high-fidelity retrieval augmented generation in industrial knowledge bases. Integrates jargon identification, context rec… ☆34 Updated last year
- ☆14 Updated last year
- Creating Generative AI Apps which work ☆17 Updated 4 months ago