awslabs/rag-qa-arena

☆53

Alternatives and similar repositories for rag-qa-arena

Users who are interested in rag-qa-arena are comparing it to the libraries listed below.

awslabs / robustqa-acl23
☆20 · Mar 22, 2024 · Updated 2 years ago
THU-KEG / R-Eval
[KDD24-ADS] R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models
☆11 · Apr 9, 2024 · Updated 2 years ago
HarlynDN / WebCiteS
[ACL'24] WebCiteS: Attributed Query-Focused Summarization on Chinese Web Search Results with Citations
☆13 · Sep 11, 2024 · Updated last year
armingh2000 / FactScoreLite
FactScoreLite is an implementation of the FactScore metric, designed for detailed accuracy assessment in text generation. This package bu…
☆14 · Apr 25, 2024 · Updated 2 years ago
McGill-NLP / retriever-lm-reasoning
Code for "Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model", EMNLP Findings 20…
☆28 · Nov 2, 2023 · Updated 2 years ago
xlang-ai / BRIGHT
[ICLR 2025] BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval
☆210 · Sep 13, 2025 · Updated 10 months ago
LgQu / TIGeR
Code for paper: Unified Text-to-Image Generation and Retrieval
☆16 · Jul 19, 2026 · Updated last week
facebookresearch / CRAG
Comprehensive benchmark for RAG
☆297 · Jun 14, 2025 · Updated last year
ritzz-ai / PACS
☆31 · Sep 12, 2025 · Updated 10 months ago
yale-nlp / ODSum
Data and code for paper "ODSum: New Benchmarks for Open Domain Multi-Document Summarization"
☆11 · Sep 20, 2024 · Updated last year
googleinterns / localizing-paragraph-memorization
☆15 · Feb 21, 2024 · Updated 2 years ago
Betswish / MIRAGE
Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/
☆25 · Mar 10, 2025 · Updated last year
TIGER-AI-Lab / LLM-AMT
This repository contains the code for our paper "Augmenting Black-box LLMs with Medical Textbooks for Clinical Question Answering" [EMNLP…
☆14 · Oct 8, 2024 · Updated last year
horizon-llm / Think-RM
[NeurIPS 2025] Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models
☆17 · Nov 2, 2025 · Updated 8 months ago
HansiZeng / PAG
[SIGIR 2024] The official repo for paper "Planning Ahead in Generative Retrieval: Guiding Autoregressive Generation through Simultaneous …
☆32 · Apr 24, 2024 · Updated 2 years ago
microsoft / zkbc
Early prototype of zero-knowledge verifiable ML benchmarks
☆15 · Nov 15, 2024 · Updated last year
primeqa / clapnq
☆46 · Jan 21, 2025 · Updated last year
gabriben / awesome-generative-information-retrieval
☆728 · Oct 7, 2025 · Updated 9 months ago
s-vco / s-vco
Symmetrical Visual Contrastive Optimization: Aligning Vision-Language Models with Minimal Contrastive Images
☆19 · Jun 4, 2025 · Updated last year
falcondai / pyrouge
A Python wrapper for the ROUGE summarization evaluation package
☆14 · Aug 9, 2017 · Updated 8 years ago
yumoxu / marge
Code for ACL 21: Generating Query Focused Summaries from Query-Free Resources
☆33 · Jul 15, 2022 · Updated 4 years ago
algoprog / SynTOD
Synthetic data generation for TODs
☆23 · Jul 17, 2024 · Updated 2 years ago
ictnlp / TLAT-NMT
Source code for the EMNLP 2020 long paper <Token-level Adaptive Training for Neural Machine Translation>.
☆20 · Oct 28, 2022 · Updated 3 years ago
AIM3-RUC / MPMQA
Official repository of the paper MPMQA: Multimodal Question Answering on Product Manuals (AAAI 2023)
☆21 · Nov 28, 2022 · Updated 3 years ago
MexicanLemonade / LLM-Misinfo-QA
This repository contains data and code used for On the Risk of Misinformation Pollution with Large Language Models (EMNLP 2023 Findings).
☆17 · Dec 14, 2023 · Updated 2 years ago
varshakishore / IncDSI
☆11 · Sep 10, 2023 · Updated 2 years ago
katiekang1998 / llm_hallucinations
☆18 · May 28, 2024 · Updated 2 years ago
microsoft / livedrbench
Live Deep Research Bench. A challenging, objective benchmark for deep research tasks.
☆20 · Oct 16, 2025 · Updated 9 months ago
princeton-pli / QRHead
QRHead: Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking
☆40 · Jan 20, 2026 · Updated 6 months ago
THUDM / LongCite
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
☆520 · Dec 31, 2024 · Updated last year
MTTeql / MT-Teql
Research Artifact For Our Submission To VLDB
☆11 · Oct 27, 2021 · Updated 4 years ago
zhiyuanpeng / SPTAR
Soft Prompt Tuning for Augmenting Dense Retrieval with Large Language Models
☆16 · Feb 19, 2025 · Updated last year
XinyuanLu00 / SciTab
The project page for "SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables"
☆23 · Dec 21, 2023 · Updated 2 years ago
zhudotexe / fanoutqa
Companion code for FanOutQA: Multi-Hop, Multi-Document Question Answering for Large Language Models (ACL 2024)
☆62 · Apr 3, 2026 · Updated 3 months ago
google-deepmind / loft
LOFT: A 1 Million+ Token Long-Context Benchmark
☆237 · Apr 13, 2026 · Updated 3 months ago
OPPO-Mente-Lab / AndesVL_Evaluation
☆26 · Apr 15, 2026 · Updated 3 months ago
chanchimin / RQ-RAG
Codes for our paper "RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation"
☆212 · Aug 16, 2024 · Updated last year
g588928812 / qlora
QLoRA: Efficient Finetuning of Quantized LLMs
☆11 · Jul 22, 2023 · Updated 3 years ago
cxcscmu / General-AgentBench
Benchmark Test-Time Scaling of General LLM Agents
☆20 · Apr 14, 2026 · Updated 3 months ago