youdotcom-oss/ydc-deep-research-evals

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/youdotcom-oss/ydc-deep-research-evals)

youdotcom-oss / ydc-deep-research-evals

you.com's framework for evaluating deep research systems.

☆76

Alternatives and similar repositories for ydc-deep-research-evals

Users that are interested in ydc-deep-research-evals are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

dzungvpham / awesome-llm4privacy
View on GitHub
A curated collection of papers and related projects on using LLMs for privacy.
☆31Oct 8, 2025Updated 7 months ago
allenai / BEEP
View on GitHub
Code repository for BEEP (Biomedical Evidence Enhanced Predictions) clinical outcome prediction system
☆26Nov 8, 2023Updated 2 years ago
GAIR-NLP / ResearcherBench
View on GitHub
ResearcherBench: Evaluating Deep AI Research Systems on the Frontiers of Scientific Inquiry
☆49Jan 5, 2026Updated 4 months ago
sjtu-sai-agents / Browse-Master
View on GitHub
Official implementation of Browse-Master, a tool-augmented web-search agent.
☆31Aug 22, 2025Updated 9 months ago
ygcinar / SmoothI
View on GitHub
☆10May 9, 2021Updated 5 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
esdurmus / feqa
View on GitHub
Data and code for "A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization" (ACL 2020)
☆48Jun 12, 2023Updated 2 years ago
chonghin33 / lcm-1.13-whitepaper
View on GitHub
This project contains the original white paper for Language Construct Modeling (LCM) v1.13, authored by Vincent Shing Hin Chong. It intro…
☆15Jul 23, 2025Updated 10 months ago
streamingfast / substreams-rs
View on GitHub
☆13May 7, 2026Updated 3 weeks ago
ByteDance-BandAI / LLM-I
View on GitHub
🚀 LLM-I: Transform LLMs into natural interleaved multimodal creators! ✨ Tool-use framework supporting image search, generation, code ex…
☆41Oct 20, 2025Updated 7 months ago
yfyuan01 / MQC
View on GitHub
☆12Nov 5, 2024Updated last year
shangdatalab / Deep-Contam
View on GitHub
Official implementation of Data Contamination Can Cross Language Barriers
☆12Sep 11, 2024Updated last year
AnswerDotAI / fastcaddy
View on GitHub
A simple python wrapper for using the Caddy API
☆27May 20, 2026Updated last week
Ayanami0730 / deep_research_bench
View on GitHub
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
☆730May 11, 2026Updated 2 weeks ago
ictnlp / AIH
View on GitHub
Code for Findings of ACL 2021 paper "Addressing Inquiries about History: An Efficient and Practical Framework for Evaluating Open-domain …
☆19Dec 16, 2022Updated 3 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
ma787639046 / bowdpr
View on GitHub
[SIGIR24] Pre-training with Bag-of-Word Prediction for Dense Passage Retrieval
☆18Feb 29, 2024Updated 2 years ago
brightjade / SimCKP
View on GitHub
Source code for "SimCKP: Simple Contrastive Learning of Keyphrase Representations", Findings of EMNLP 2023
☆12Jun 20, 2025Updated 11 months ago
OpenFinArena / OpenFinArena
View on GitHub
Benchmarking LLMs and Agents in Rigorous Financial Analysis and Forecast
☆24May 10, 2026Updated 2 weeks ago
shreydesai / hurricane
View on GitHub
Code and datasets for the ACL 2020 paper "Detecting Perceived Emotions in Hurricane Disasters"
☆12Oct 4, 2022Updated 3 years ago
forecastingresearch / forecastbench-datasets
View on GitHub
Forecastbench Datasets, updated nightly
☆28May 21, 2026Updated last week
allenai / clarifydelphi
View on GitHub
☆12Apr 24, 2024Updated 2 years ago
itjieluo / EcommerceShoppingGuideRobot
View on GitHub
基于电商导购机器人，自然语言理解（NLU），文本纠错，歧义词消歧
☆12May 5, 2020Updated 6 years ago
pizofreude / metaprompt
View on GitHub
Metaprompt is an AI-powered prompt generator developed by Anthropic. This is the unofficial Metaprompt Community Github repo. All PRs are…
☆14Mar 19, 2024Updated 2 years ago
zorazrw / workflow-induction-toolkit
View on GitHub
A toolkit to induce interpretable workflows from raw computer-use activities.
☆44Nov 13, 2025Updated 6 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
cryptoandcoffee / akashos
View on GitHub
AkashOS is an unattended install of Ubuntu Server that will become the operating system of the machine. Akash OS will create a Kubernetes…
☆13Dec 22, 2025Updated 5 months ago
dengmengjie / ToolScope
View on GitHub
Official repository for ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use
☆30Nov 4, 2025Updated 6 months ago
elehman16 / discq
View on GitHub
☆19Oct 13, 2022Updated 3 years ago
ryantzr1 / OpenAlita
View on GitHub
Open Source Implementation of Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evo…
☆100Jul 18, 2025Updated 10 months ago
brightjade / PRiSM
View on GitHub
Source code for paper "PRiSM: Enhancing Low-Resource Document-Level Relation Extraction with Relation-Aware Score Calibration", Findings …
☆11Jun 20, 2025Updated 11 months ago
IntelLabs / atlas-cli
View on GitHub
A command-line interface tool for creating, managing, and verifying Content Provenance and Authenticity (C2PA) manifests for machine lear…
☆22May 22, 2026Updated last week
DataBassGit / o7
View on GitHub
Agent framework for generating a synthetic dataset. This will be raw CoT and Reflection output to be cleaned up by a later step.
☆17Apr 11, 2025Updated last year
yecchen / GDELT-ComplexEvent
View on GitHub
Code and Data for "SCTc-TE: A Comprehensive Formulation and Benchmark for Temporal Event Forecasting""
☆16Feb 2, 2024Updated 2 years ago
haydenbleasel / tiptap-extensions
View on GitHub
A collection of Tiptap extensions, versioned and released independently.
☆25May 3, 2026Updated 3 weeks ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
RUC-NLPIR / RAG-Reading-List
View on GitHub
RAG methods, benchmarks, and toolkits
☆19Nov 28, 2024Updated last year
heejin928 / How-Positive-Are-You-Text-Style-Transfer-using-Adaptive-Style-Embedding
View on GitHub
☆14Oct 30, 2021Updated 4 years ago
RedisLabs / vault-plugin-database-redis-enterprise
View on GitHub
HashiCorp Vault Plugins for Redis Enterprise
☆16May 13, 2026Updated 2 weeks ago
Stability-AI / deep-learning-containers
View on GitHub
AWS Deep Learning Containers (DLCs) are a set of Docker images for training and serving models in TensorFlow, TensorFlow 2, PyTorch, and …
☆11Feb 11, 2025Updated last year
ricsinaruto / NeuralChatbots-DataFiltering
View on GitHub
Filter dialog data with a simple entropy-based method (see ACL paper)
☆14Oct 4, 2019Updated 6 years ago
DidierRLopes / obb-zeitgeist
View on GitHub
prediction markets -> llm -> news
☆25Nov 10, 2025Updated 6 months ago
Finndersen / pydanticai_mcp_demo
View on GitHub
Demonstrate using MCP with Pydantic AI framework
☆14Mar 14, 2025Updated last year