Episoode/Double-Bench

[AAAI-26] Are We on the Right Way for Assessing Document Retrieval-Augmented Generation?

☆31

Alternatives and similar repositories for Double-Bench

Users who are interested in Double-Bench are comparing it to the repositories listed below.

Zhaoyang-Chu / code-unlearning
This repository contains a PyTorch implementation of the ICSE'26 paper "Scrub It Out! Erasing Sensitive Memorization in Code Language Mod…
☆30 · Sep 18, 2025 · Updated 10 months ago
MMDocRAG / MMDocRAG
The code used to train and run inference with MMDocRAG
☆21 · Nov 6, 2025 · Updated 8 months ago
CLUEbenchmark / Math24o
Math24o: High School Olympiad Mathematics Chinese Benchmark
☆14 · Mar 27, 2025 · Updated last year
Dongping-Chen / ISG
(ICLR 2025 Spotlight) Official code repository for Interleaved Scene Graph.
☆31 · Aug 7, 2025 · Updated 11 months ago
allenai / DrawEduMath
Can VLMs understand students' hand-drawn math work?
☆19 · Jan 20, 2026 · Updated 6 months ago
MileBench / MileBench
This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"
☆38 · Jul 11, 2024 · Updated 2 years ago
Dongping-Chen / Clawatar
From Agentic Intelligence to Interactive Intelligence. Give your AI agent a body and home.
☆19 · Feb 22, 2026 · Updated 5 months ago
360CVGroup / RzenEmbed
Embedding model prioritized towards Multimodal RAG, overall + VisDoc double top1 on MMEB benchmark
☆36 · Jun 16, 2026 · Updated last month
walkalone20 / HUST_RISCV-CPU
Course project for Principles of Computer Organization at Huazhong University of Science and Technology (HUST)
☆36 · Dec 28, 2022 · Updated 3 years ago
lezhang7 / Rearank
[EMNLP 2025] Official codebase for Rearank: Reasoning Re-ranking Agent
☆40 · Aug 20, 2025 · Updated 11 months ago
Fu-Dayuan / AgentRefine
(ICLR 2025) AgentRefine: Enhancing Agent Generalization through Refinement Tuning
☆20 · Nov 22, 2025 · Updated 8 months ago
jina-ai / jina-vdr
Jina VDR is a multilingual, multi-domain benchmark for visual document retrieval
☆38 · Aug 4, 2025 · Updated 11 months ago
OpenGVLab / Docopilot
[CVPR 2025] Docopilot: Improving Multimodal Models for Document-Level Understanding
☆37 · Jul 22, 2025 · Updated last year
VidCapBench / VidCapBench
☆13 · May 17, 2025 · Updated last year
SalesforceAIResearch / UniDoc-Bench
☆38 · Jun 2, 2026 · Updated last month
MMDocRAG / MMDocIR
The code used to train and run inference with MMDocIR
☆34 · May 29, 2025 · Updated last year
adxcreative / EERCF
Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning
☆21 · Feb 19, 2025 · Updated last year
niuliang42 / CodexLeaks
CodexLeaks: Privacy Leaks from Code Generation Language Models in GitHub Copilot
☆11 · Jul 11, 2023 · Updated 3 years ago
bigcode-project / pii-lib
Code for PII detection and redaction in code datasets
☆15 · Jan 24, 2023 · Updated 3 years ago
nttmdlab-nlp / VDocRAG
[CVPR 2025] VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents
☆66 · May 26, 2025 · Updated last year
ant-research / M2-Miner
[ICLR 2026] M2-Miner: Multi-Agent Enhanced MCTS for Mobile GUI Agent Data Mining
☆55 · Apr 22, 2026 · Updated 3 months ago
Mungeryang / colqwen3
The code used to train and run inference with the ColQwen3 model. Welcome to follow and star! ⭐️⭐️⭐️ https://huggingface.co/goodman2001/…
☆15 · Jul 4, 2026 · Updated 2 weeks ago
Lslland / T-Vaccine
☆19 · Jun 21, 2025 · Updated last year
ruc-datalab / SC-prompt
☆12 · May 13, 2023 · Updated 3 years ago
VIM-Bench / VIM_TOOL
☆12 · Jun 12, 2024 · Updated 2 years ago
EnVision-Research / PhysToolBench
PhysToolBench: Benchmarking Physical Tool Understanding for MLLMs
☆30 · Updated this week
ejmichaud / neural-verification
MI and Formal Verification of NNs on Algorithmic tasks!
☆18 · Mar 18, 2024 · Updated 2 years ago
Alibaba-NLP / ViDoRAG
[EMNLP 2025] ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents
☆669 · Jan 11, 2026 · Updated 6 months ago
GaryGuTC / UniME-v2
[AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning"
☆74 · Dec 8, 2025 · Updated 7 months ago
TrustGen / TrustEval-toolkit
[ICLR'26, NAACL'25 Demo] Toolkit & Benchmark for evaluating the trustworthiness of generative foundation models.
☆132 · Aug 22, 2025 · Updated 11 months ago
yejinc00 / PREMIR
[EMNLP 2025] The official implementation of "Zero-shot Multimodal Document Retrieval via Cross-Modal Question Generation"
☆15 · Aug 26, 2025 · Updated 10 months ago
JohnJiang12138 / CMRP
Cross-modal Reinforced Prompting for Graph and Language Tasks, KDD 2024.
☆11 · Sep 29, 2024 · Updated last year
facebookresearch / mmd
ML models often mispredict, and it is hard to tell when and why. We present a data mining based approach to discover whether there is a c…
☆17 · Jun 6, 2022 · Updated 4 years ago
YqjMartin / AgenticRAGTracer
AgenticRAGTracer: A Hop-Aware Benchmark for Diagnosing Multi-Step Retrieval Reasoning in Agentic RAG [ACL'26]
☆18 · Jun 30, 2026 · Updated 3 weeks ago
zcysky / HUST-CS-ProblemList
A problem list for HUST-CS.
☆19 · Jan 16, 2022 · Updated 4 years ago
Institut-Polytechnique-de-Paris / time-disentanglement-lib
🤗 [ICLR 2024] Disentangling Time Series Representations via Contrastive based l-Variational Inference
☆20 · Dec 11, 2025 · Updated 7 months ago
Gabesarch / ICAL
☆53 · May 11, 2025 · Updated last year
gary21978 / wlsfilter
Weighted least-squares filtering
☆14 · Jan 12, 2021 · Updated 5 years ago
MaYufei-NPU / InfoGain-RAG
Implementation of EMNLP Oral Paper: InfoGain-RAG: Boosting Retrieval-Augmented Generation through Document Information Gain-based Reranki…
☆18 · Sep 17, 2025 · Updated 10 months ago