you.com's framework for evaluating deep research systems.
☆71May 15, 2025Updated 10 months ago
Alternatives and similar repositories for ydc-deep-research-evals
Users that are interested in ydc-deep-research-evals are comparing it to the libraries listed below.
- An enterprise deep research benchmark☆35Mar 22, 2026Updated last week
- Code repository for BEEP (Biomedical Evidence Enhanced Predictions) clinical outcome prediction system☆26Nov 8, 2023Updated 2 years ago
- ResearcherBench: Evaluating Deep AI Research Systems on the Frontiers of Scientific Inquiry☆47Jan 5, 2026Updated 2 months ago
- ☆10Nov 8, 2020Updated 5 years ago
- ☆12May 1, 2023Updated 2 years ago
- ☆27Mar 10, 2026Updated 2 weeks ago
- Temporal and Causal Reasoning (dataset)☆10Apr 19, 2022Updated 3 years ago
- Code for the paper "Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search"☆62Jul 4, 2025Updated 8 months ago
- This project contains the original white paper for Language Construct Modeling (LCM) v1.13, authored by Vincent Shing Hin Chong. It intro…☆15Jul 23, 2025Updated 8 months ago
- Official code repository for PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation, EMNLP 2023☆12Dec 13, 2023Updated 2 years ago
- ☆13Mar 4, 2026Updated 3 weeks ago
- 🚀 LLM-I: Transform LLMs into natural interleaved multimodal creators! ✨ Tool-use framework supporting image search, generation, code ex…☆41Oct 20, 2025Updated 5 months ago
- DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents☆647Updated this week
- Official implementation of Data Contamination Can Cross Language Barriers☆12Sep 11, 2024Updated last year
- Code for Findings of ACL 2021 paper "Addressing Inquiries about History: An Efficient and Practical Framework for Evaluating Open-domain …☆19Dec 16, 2022Updated 3 years ago
- Python implementation of a simple neural network, including AND, OR, and XOR demos.☆11Jun 13, 2019Updated 6 years ago
- Efficient Symptom Inquiring and Diagnosis via Adaptive Alignment of Reinforcement Learning and Classification [AI in Medicine Journal]☆12May 20, 2022Updated 3 years ago
- The code for the paper "Building 3D Morphable Models from a Single Scan" (https://arxiv.org/abs/2011.12440).☆18Jan 13, 2021Updated 5 years ago
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations is a ServiceNow Research project that was started at Elemen…☆13Jul 31, 2023Updated 2 years ago
- Code and datasets for the ACL 2020 paper "Detecting Perceived Emotions in Hurricane Disasters"☆12Oct 4, 2022Updated 3 years ago
- ☆20Mar 5, 2024Updated 2 years ago
- ☆36Dec 14, 2025Updated 3 months ago
- ☆12Apr 24, 2024Updated last year
- EQUATE (Evaluating Quantitative Understanding Aptitude in Textual Entailment), framework for evaluating quantitative reasoning ability in…☆14Feb 13, 2022Updated 4 years ago
- E-commerce shopping-guide chatbot: natural language understanding (NLU), text error correction, and ambiguous-word disambiguation☆12May 5, 2020Updated 5 years ago
- Metaprompt is an AI-powered prompt generator developed by Anthropic. This is the unofficial Metaprompt Community Github repo. All PRs are…☆14Mar 19, 2024Updated 2 years ago
- A toolkit to induce interpretable workflows from raw computer-use activities.☆41Nov 13, 2025Updated 4 months ago
- AkashOS is an unattended install of Ubuntu Server that will become the operating system of the machine. Akash OS will create a Kubernetes…☆13Dec 22, 2025Updated 3 months ago
- Court judgment document data☆11Oct 26, 2020Updated 5 years ago
- Official repository for ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use☆29Nov 4, 2025Updated 4 months ago
- ☆19Oct 13, 2022Updated 3 years ago
- Official repository for "DEnsity: Open-domain Dialogue Evaluation Metric using Density Estimation (ACL2023 Findings)"☆11May 23, 2023Updated 2 years ago
- The official GitHub repo for the open online course "Dive into LLMs".☆10Mar 15, 2024Updated 2 years ago
- Dialog Flow Analysis Notebook for Watson Assistant☆29Nov 15, 2022Updated 3 years ago
- Source code for paper "PRiSM: Enhancing Low-Resource Document-Level Relation Extraction with Relation-Aware Score Calibration", Findings …☆11Jun 20, 2025Updated 9 months ago
- Agent framework for generating a synthetic dataset. This will be raw CoT and Reflection output to be cleaned up by a later step.☆15Apr 11, 2025Updated 11 months ago
- Code and Data for "SCTc-TE: A Comprehensive Formulation and Benchmark for Temporal Event Forecasting"☆16Feb 2, 2024Updated 2 years ago
- ☆14Oct 30, 2021Updated 4 years ago
- AWS Deep Learning Containers (DLCs) are a set of Docker images for training and serving models in TensorFlow, TensorFlow 2, PyTorch, and …☆11Feb 11, 2025Updated last year