sujitpal / llm-rag-eval
Large Language Model (LLM) powered evaluator for Retrieval Augmented Generation (RAG) pipelines.
☆29 Updated last year
Alternatives and similar repositories for llm-rag-eval
Users that are interested in llm-rag-eval are comparing it to the libraries listed below
- Compare how Agent systems perform on several benchmarks. ☆98 Updated 8 months ago
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆115 Updated 5 months ago
- LLM reads a paper and produces a working prototype ☆58 Updated 3 months ago
- ☆50 Updated 2 weeks ago
- Train your own SOTA deductive reasoning model ☆96 Updated 4 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆79 Updated 9 months ago
- Simple examples using Argilla tools to build AI ☆53 Updated 7 months ago
- accompanying material for sleep-time compute paper ☆97 Updated 2 months ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆32 Updated 3 months ago
- ☆56 Updated 7 months ago
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" ☆71 Updated 7 months ago
- Automating enterprise workflows with multimodal agents ☆108 Updated 9 months ago
- ☆71 Updated 4 months ago
- ☆76 Updated 6 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated last year
- Mixing Language Models with Self-Verification and Meta-Verification ☆106 Updated 7 months ago
- ☆69 Updated last month
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 Updated 5 months ago
- ☆144 Updated 11 months ago
- Code for ScribeAgent paper ☆58 Updated 4 months ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆84 Updated 9 months ago
- Testing speed and accuracy of RAG with, and without Cross Encoder Reranker. ☆48 Updated last year
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆95 Updated last month
- ☆40 Updated 7 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆54 Updated 5 months ago
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals". ☆67 Updated last year
- ☆96 Updated 10 months ago
- ARAGOG - Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆107 Updated last year
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆108 Updated 3 months ago
- Deep Research through Multi-Agents, using GraphRAG ☆76 Updated 8 months ago