ibm-self-serve-assets / JudgeIt-LLM-as-a-Judge
Automation framework that uses LLM-as-a-judge to evaluate Agentic AI, RAG, and Text2SQL at scale, serving as a good proxy for human judgement.
☆33Updated 3 months ago
Alternatives and similar repositories for JudgeIt-LLM-as-a-Judge
Users that are interested in JudgeIt-LLM-as-a-Judge are comparing it to the libraries listed below
- ☆82Updated 2 months ago
- Repository for “PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers”, NAACL24☆151Updated last year
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker☆125Updated 2 months ago
- 🔎 A deep-dive into HyDE for Advanced LLM RAG + 💡 Introducing AutoHyDE, a semi-supervised framework to improve the effectiveness, covera…☆33Updated last year
- Official Implementation of "Multi-Head RAG: Solving Multi-Aspect Problems with LLMs"☆236Updated 3 months ago
- This repository contains a pipeline for fine-tuning Large Language Models (LLMs) for Text-to-SQL conversion using General Reward Proximal…☆42Updated 8 months ago
- Large Language Model (LLM) powered evaluator for Retrieval Augmented Generation (RAG) pipelines.☆32Updated last year
- ☆50Updated last year
- ☆147Updated last year
- ☆105Updated last year
- 🔧 Compare how Agent systems perform on several benchmarks. 📊🚀☆102Updated 5 months ago
- ☆54Updated last year
- Official repository for RAGViz: Diagnose and Visualize Retrieval-Augmented Generation [EMNLP 2024]☆88Updated 11 months ago
- Lighter, cheaper and faster RAG toolkit (Graph RAG) supported by TargetPilot☆46Updated 7 months ago
- Reward Model framework for LLM RLHF☆62Updated 2 years ago
- DSPy in action with open-source LLMs.☆102Updated last year
- ☆63Updated last year
- ☆31Updated last year
- ☆104Updated 9 months ago
- Open Implementations of LLM Analyses☆107Updated last year
- [EMNLP 2024] OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs.☆147Updated last year
- DocLLM: A layout-aware generative language model for multimodal document understanding☆133Updated 2 years ago
- Model, Code & Data for the EMNLP'23 paper "Making Large Language Models Better Data Creators"☆137Updated 2 years ago
- Mixing Language Models with Self-Verification and Meta-Verification☆111Updated last year
- Self-Reflection in LLM Agents: Effects on Problem-Solving Performance☆92Updated last year
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models☆115Updated 9 months ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners"☆118Updated 2 months ago
- This is the repo for the LegalBench-RAG Paper: https://arxiv.org/abs/2408.10343.☆149Updated 7 months ago
- Code for the paper, From RAG to QA-RAG: Integrating Generative AI for Pharmaceutical Regulatory Compliance Process☆22Updated last year
- Testing speed and accuracy of RAG with and without a Cross Encoder Reranker.☆50Updated last year