RulinShao/RAG-evaluation-harnesses

An evaluation suite for Retrieval-Augmented Generation (RAG).

☆25

Alternatives and similar repositories for RAG-evaluation-harnesses

Users interested in RAG-evaluation-harnesses are comparing it to the repositories listed below.

stefanhgm / patient_summaries_with_llms
Code for "A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models"
☆17 · Jul 20, 2025 · Updated last year
JinLi-i / MoDiCF
The source code of [WWW 2025] MoDiCF
☆16 · Mar 26, 2026 · Updated 4 months ago
ncsu-dk-lab / Acc-DD
☆14 · Apr 21, 2023 · Updated 3 years ago
TraceElephant / TraceElephant
Repo of "Seeing the Whole Elephant: A Benchmark for Failure Attribution in LLM-based Multi-Agent Systems" (ACL 2026)
☆16 · Apr 27, 2026 · Updated 3 months ago
CogComp / faithful_summarization
☆18 · May 5, 2021 · Updated 5 years ago
RulinShao / massive-serve
Python package for serving a local search engine. One command to download and serve a datastore---that's it 😎.
☆26 · Jun 6, 2025 · Updated last year
levilelis / h-levin
Levin tree search guided by both a policy and a heuristic function
☆19 · Jul 13, 2023 · Updated 3 years ago
haolunc / iGSM-Replication-physics-LLM
This repository contains the replication of the iGSM dataset generation process from the Physics of LLM paper by Zeyuan Zhu.
☆17 · Sep 13, 2024 · Updated last year
VincenDen / IID
Exploiting Inter-sample and Inter-feature Relations in Dataset Distillation (CVPR24)
☆10 · Jun 16, 2024 · Updated 2 years ago
Victorwz / LaViA
☆10 · Jul 13, 2024 · Updated 2 years ago
Lichang-Chen / AlpaGasus
A better Alpaca Model Trained with Less Data (only 9k instructions of the original set)
☆24 · Jul 26, 2024 · Updated 2 years ago
ZNLP / Language-Imbalance-Driven-Rewarding
[ICLR 2025] Language Imbalance Driven Rewarding for Multilingual Self-improving
☆25 · Apr 6, 2026 · Updated 3 months ago
SciMT / SciMT-benchmark
☆11 · Jan 3, 2024 · Updated 2 years ago
lych1233 / GAMMA-human-ai-collaboration
☆11 · Jan 13, 2026 · Updated 6 months ago
nhattruongpham / mmser
SERVER: Multi-modal Speech Emotion Recognition using Transformer-based and Vision-based Embeddings
☆15 · Jan 23, 2024 · Updated 2 years ago
rvenet / RVENet
Source code related to the research paper entitled RVENet: A Large Echocardiographic Dataset for the Deep Learning-Based Assessment of Ri…
☆12 · Mar 10, 2024 · Updated 2 years ago
Planetary-Soft-Landing-at-Purdue / PSL-with-KSP
The goal of this project is to develop a program for planetary soft landings using lossless convexification of non convex control bounds.
☆12 · Mar 25, 2022 · Updated 4 years ago
kfq20 / AEGIS
AEGIS: Automated Error Generation and Attribution for Multi-Agent Systems
☆25 · Feb 28, 2026 · Updated 4 months ago
dair-iitd / tkbi
☆25 · May 29, 2022 · Updated 4 years ago
liziniu / HyperDQN
Code for ICLR 2022 Paper (HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning)
☆12 · Nov 28, 2023 · Updated 2 years ago
abbyvansoest / maxent
☆14 · May 30, 2019 · Updated 7 years ago
AkideLiu / MiniCache
☆14 · Sep 7, 2024 · Updated last year
apple / ml-spin
This repository contains the official implementation for the ECCV'22 paper, "SPIN: An Empirical Evaluation on Sharing Parameters of Isotr…
☆20 · Sep 9, 2023 · Updated 2 years ago
caiqizh / LUQ
☆14 · Jan 14, 2026 · Updated 6 months ago
DarshanDeshpande / tfrecord-generator
This repository contains scripts for conversion of data required for most commonly found Machine Learning tasks to TFRecords
☆13 · Mar 6, 2021 · Updated 5 years ago
kahache / video_packaging_platform
Video packaging platform - this will build a Docker with a web API that will let you upload, encrypt and serve videos as MPEG DASH files
☆10 · Jul 1, 2026 · Updated 3 weeks ago
liziniu / policy_optimization
Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data)
☆29 · Dec 19, 2023 · Updated 2 years ago
saschaschramm / MonteCarloTreeSearch
This project applies Monte Carlo Tree Search (MCTS) to a simple grid world.
☆10 · May 30, 2018 · Updated 8 years ago
SYXieee / SuperGS
SuperGS: Super-Resolution 3D Gaussian Splatting Enhanced by Variational Residual Features and Uncertainty-Augmented Learning
☆11 · May 24, 2025 · Updated last year
aisa-group / InferenceBench
Benchmarking Open-Ended Inference Optimization by AI Agents
☆32 · Jul 6, 2026 · Updated 2 weeks ago
robintyh1 / icml2021-pengqlambda
Revisiting Peng's Q(lambda) for Modern Reinforcement Learning
☆15 · Jul 23, 2021 · Updated 5 years ago
USTC-StarTeam / ZIP
arXiv 2024 | ZIP: entropy-law data selection for efficient LLM alignment.
☆28 · Jun 10, 2026 · Updated last month
mingyin1 / Agents_Failure_Attribution
Benchmark for automated failure attributions in agentic systems (🏆 ICML 2025 Spotlight)
☆24 · Jan 6, 2026 · Updated 6 months ago
netspooky / uJunk
An unsorted collection of little tools and scripts I've made that don't fit anywhere else
☆19 · Jul 15, 2022 · Updated 4 years ago
tangzhy / RealCritic
☆15 · Jan 27, 2025 · Updated last year
lmb-freiburg / td-or-not-td
Code for the paper "TD or not TD: Analyzing the Role of Temporal Differencing in Deep Reinforcement Learning", Artemij Amiranashvili, Ale…
☆12 · Aug 24, 2018 · Updated 7 years ago
kennethorq / SMORE
[WSDM 2025] Source code for "Spectrum-based Modality Representation Fusion Graph Convolutional Network for Multimodal Recommendation".
☆36 · Dec 22, 2024 · Updated last year
yixiaoer / tpux
A set of Python scripts that makes your experience on TPU better
☆56 · Sep 18, 2025 · Updated 10 months ago
EleutherAI / pile_dedupe
Pile Deduplication Code
☆18 · May 15, 2023 · Updated 3 years ago