ibm-self-serve-assets / JudgeIt-LLM-as-a-Judge
Automation framework that uses LLM-as-a-judge to evaluate Agentic AI, RAG, and Text2SQL at scale, serving as a good proxy for human judgement.
☆32Updated 2 months ago
Alternatives and similar repositories for JudgeIt-LLM-as-a-Judge
Users who are interested in JudgeIt-LLM-as-a-Judge are comparing it to the libraries listed below.
- Repository for “PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers”, NAACL24☆152Updated last year
- 🔎 A deep-dive into HyDE for Advanced LLM RAG + 💡 Introducing AutoHyDE, a semi-supervised framework to improve the effectiveness, covera…☆33Updated last year
- ☆81Updated last month
- Lighter, cheaper and faster RAG toolkit (Graph RAG) supported by TargetPilot☆46Updated 6 months ago
- Large Language Model (LLM) powered evaluator for Retrieval Augmented Generation (RAG) pipelines.☆32Updated last year
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models☆115Updated 8 months ago
- Official Implementation of "Multi-Head RAG: Solving Multi-Aspect Problems with LLMs"☆235Updated 2 months ago
- ☆148Updated last year
- [TACL, EMNLP 2025 Oral] Code, datasets, and checkpoints for the paper "CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Thr…☆32Updated 2 weeks ago
- Submodular optimization for context engineering: query fan-out, text selection, passage reranking☆77Updated 5 months ago
- Model, Code & Data for the EMNLP'23 paper "Making Large Language Models Better Data Creators"☆137Updated 2 years ago
- ☆49Updated last year
- Ready-to-go containerized RAG service. Implemented with text-embedding-inference + Qdrant/LanceDB.☆73Updated 11 months ago
- ☆63Updated last year
- Official Code for Oᴘᴇɴ-RAG: Enhanced Retrieval Augmented Reasoning with Open-Source Large Language Models (EMNLP Findings 2024)☆143Updated 10 months ago
- ☆102Updated last year
- ☆23Updated 11 months ago
- ☆31Updated last year
- SQL-RL-GEN is an algorithm based on a Reinforcement Learning approach with a reward function generated by an LLM to guide the agent's …☆18Updated 3 months ago
- Official repository for RAGViz: Diagnose and Visualize Retrieval-Augmented Generation [EMNLP 2024]☆88Updated 11 months ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker☆125Updated last month
- Codebase accompanying the Summary of a Haystack paper.☆79Updated last year
- Self-Reflection in LLM Agents: Effects on Problem-Solving Performance☆92Updated last year
- Official repo for "LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs".☆242Updated last year
- Evaluation of bm42 sparse indexing algorithm☆72Updated last year
- ☆103Updated 8 months ago
- This repository contains a pipeline for fine-tuning Large Language Models (LLMs) for Text-to-SQL conversion using General Reward Proximal…☆39Updated 8 months ago
- ☆52Updated last year
- 🔧 Compare how Agent systems perform on several benchmarks. 📊🚀☆102Updated 4 months ago
- Blended RAG: Improving RAG (Retriever-Augmented Generation) Accuracy with Semantic Search and Hybrid Query-Based Retrievers☆81Updated 7 months ago