ServiceNow/drbench

An enterprise deep research benchmark

☆40

Alternatives and similar repositories for drbench

Users who are interested in drbench are comparing it to the libraries listed below.

The-AI-Alliance / cube-standard
Standardize benchmark wrapping so the community can wrap various otherwise-incompatible benchmarks uniformly and use them everywhere.
☆52 · Jul 17, 2026 · Updated last week
ServiceNow / agent-poirot
☆17 · May 6, 2025 · Updated last year
aldro61 / PaperAtlas
☆24 · Dec 21, 2025 · Updated 7 months ago
ServiceNow / AgentAda
Agent ADA is a comprehensive evaluation and data analytics framework focused on insights generation and skills assessment.
☆15 · Aug 19, 2025 · Updated 11 months ago
McGill-NLP / llm2vec-gen
Code for `LLM2VEC-GEN: Generative Embeddings from Large Language Models`
☆73 · Apr 5, 2026 · Updated 3 months ago
McGill-NLP / retriever-lm-reasoning
Code for "Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model", EMNLP Findings 20…
☆28 · Nov 2, 2023 · Updated 2 years ago
McGill-NLP / safearena
SafeArena is a benchmark for assessing the harmful capabilities of web agents
☆24 · Apr 23, 2025 · Updated last year
WSE-research / LinguaF
Python package for calculating famous measures in computational linguistics
☆15 · Jun 29, 2026 · Updated 3 weeks ago
AI-Maker-Space / Chainlit-Event-AIM
A Chainlit App Used to Showcase: Async, Caching, Additional Chainlit Methods, and more!
☆11 · Oct 1, 2024 · Updated last year
henryzhao5852 / BeamDR
☆15 · Oct 10, 2021 · Updated 4 years ago
ServiceNow / PipelineRL
A scalable asynchronous reinforcement learning implementation with in-flight weight updates.
☆430 · Updated this week
UKPLab / acl2024-triple-encoders
triple-encoders is a library for contextualizing distributed Sentence Transformers representations.
☆15 · Sep 3, 2024 · Updated last year
yrahal / paircoder
☆12 · May 1, 2023 · Updated 3 years ago
psinger / kaggle-curriculum-solution
☆17 · Mar 24, 2023 · Updated 3 years ago
kyriemao / ChatRetriever
☆13 · Apr 18, 2024 · Updated 2 years ago
xhluca / material-ui-in-pyodide
☆10 · Aug 22, 2022 · Updated 3 years ago
youdotcom-oss / ydc-deep-research-evals
you.com's framework for evaluating deep research systems.
☆76 · May 15, 2025 · Updated last year
alainray / causal_inference
Repository for my studies of Causal Inference
☆10 · Dec 1, 2019 · Updated 6 years ago
ServiceNow / promptmix-emnlp-2023
Official code repository for PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation, EMNLP 2023
☆12 · Dec 13, 2023 · Updated 2 years ago
McGill-NLP / feedbackqa
FeedbackQA: Improving Question Answering Post-Deployment with Interactive Feedback
☆12 · Jul 13, 2022 · Updated 4 years ago
ServiceNow / WorkArena
WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
☆261 · Apr 25, 2026 · Updated 3 months ago
heshenghuan / Prime-Path-Coverage
☆19 · Jan 27, 2023 · Updated 3 years ago
McGill-NLP / AURORA
Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation
☆35 · Jun 30, 2025 · Updated last year
OPPO-PersonalAI / FINDER_DEFT
Official implementation for paper "How Far Are We from Genuinely Useful Deep Research Agents?"
☆66 · Dec 10, 2025 · Updated 7 months ago
AmirAbaskohi / SproutRAG
SproutRAG is a retrieval-augmented generation stack built for structured, multi-granularity evidence. It combines hierarchical attention-…
☆24 · Jun 30, 2026 · Updated 3 weeks ago
skylar-sutherland / single-scan-3dmms
The code for the paper "Building 3D Morphable Models from a Single Scan" (https://arxiv.org/abs/2011.12440).
☆18 · Jan 13, 2021 · Updated 5 years ago
ServiceNow / beyond-trivial-explanations
Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations is a ServiceNow Research project that was started at Elemen…
☆13 · Jul 31, 2023 · Updated 2 years ago
McGill-NLP / topiocqa
Code and data for reproducing baselines for TopiOCQA, an open-domain conversational question-answering dataset
☆57 · Nov 15, 2023 · Updated 2 years ago
Jwoo5 / ecg-reasoning-benchmark
Official repository for distributing ECG-Reasoning-Benchmark dataset
☆15 · Apr 29, 2026 · Updated 2 months ago
ServiceNow / insight-bench
☆73 · May 6, 2026 · Updated 2 months ago
Ipouyall / Benchmarking_ChatGPT_for_Persian
Benchmarking ChatGPT for Persian: A Preliminary Study
☆22 · Apr 6, 2024 · Updated 2 years ago
BorgwardtLab / fMRI_Cubical_Persistence
Code of our NeurIPS 2020 publication 'Uncovering the Topology of Time-Varying fMRI Data using Cubical Persistence'
☆24 · Oct 22, 2020 · Updated 5 years ago
VAGOsolutions / SauerkrautLM-Doom-MultiVec
A tiny 1.3M parameter model that plays DOOM, outperforming LLMs up to 92,000x its size.
☆26 · May 11, 2026 · Updated 2 months ago
EhsanAghazadeh / Metaphors_in_PLMs
Probing and Generalization of Metaphorical Knowledge in Pre-Trained Language Models [ACL 2022]
☆23 · May 15, 2022 · Updated 4 years ago
FujitsuResearch / FieldWorkArena
An evaluation environment for video analytics AI agent service.
☆15 · May 22, 2026 · Updated 2 months ago
UNITES-Lab / Occult
[ICML'25] Official code for paper "Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training an…
☆13 · Apr 17, 2025 · Updated last year
dmis-lab / TouR
Findings of ACL'2023: Optimizing Test-Time Query Representations for Dense Retrieval
☆30 · Oct 24, 2023 · Updated 2 years ago
jixuan-wang / Grad2Task
Code for the paper "Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation"
☆14 · Nov 24, 2022 · Updated 3 years ago
owlbarn / owl_mask_rcnn
Implementation of the Mask R-CNN model using OCaml's numerical library Owl.
☆19 · Jan 30, 2020 · Updated 6 years ago