DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation.
☆127Feb 10, 2026Updated 2 weeks ago
Alternatives and similar repositories for DeepResearchEval
Users who are interested in DeepResearchEval are comparing it to the libraries listed below.
- [NeurIPS 2025@FoRLM] R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search☆17Jan 24, 2026Updated last month
- [arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies☆59Feb 6, 2026Updated 3 weeks ago
- ⚔️ [ICLR 2026] Official code of "Search Arena: Analyzing Search-Augmented LLMs".☆49Feb 23, 2026Updated last week
- MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs☆38Feb 19, 2026Updated last week
- PICABench: How Far Are We from Physically Realistic Image Editing?☆36Nov 5, 2025Updated 3 months ago
- The code for the paper *The Sensitivity of Counterfactual Fairness to Unmeasured Confounding* @ UAI 2019☆14Apr 4, 2020Updated 5 years ago
- Official repository of the EMNLP'2020 paper "Amalgamating Knowledge from Two Teachers for Task-oriented Dialogue System with Adversarial …☆16Dec 9, 2021Updated 4 years ago
- SR-KI: Scalable and Real-Time Knowledge Integration into LLMs via Supervised Attention☆56Dec 6, 2025Updated 2 months ago
- ☆40Dec 16, 2025Updated 2 months ago
- A benchmark for evaluating LLMs on open-ended CS problems. Exploring the Next Frontier of Computer Science.☆146Feb 23, 2026Updated last week
- ☆21Nov 13, 2025Updated 3 months ago
- An LLM Chatbot based on LangGraph and LangChain that dynamically retrieves and processes resumes using RAG to perform resume screening.☆27Aug 29, 2024Updated last year
- [AAAI 2026] SlideTailor: Personalized Presentation Slide Generation for Scientific Papers☆43Jan 1, 2026Updated 2 months ago
- RePlan: Reasoning-Guided Region Planning for Complex Instruction-Based Image Editing☆58Dec 26, 2025Updated 2 months ago
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…☆12Jun 28, 2025Updated 8 months ago
- ☆68Sep 15, 2025Updated 5 months ago
- Original PyTorch implementation for AAAI 2021 Paper "Meta-Transfer Learning for Low-Resource Abstractive Summarization."☆26Jan 11, 2023Updated 3 years ago
- ☆58Dec 10, 2025Updated 2 months ago
- Group Meeting Record for Baobao Chang Group in Peking University☆26May 17, 2021Updated 4 years ago
- Multi-step AI agents powered by Gemini 2.0 and the LangGraph framework. These agents orchestrate complex workflows and enhance their reas…☆10Dec 19, 2024Updated last year
- MiroRL is an MCP-first reinforcement learning framework for deep research agents.☆233Aug 27, 2025Updated 6 months ago
- Winning solution for the Kaggle Feedback Prize Challenge.☆66Sep 5, 2022Updated 3 years ago
- ☆88Jan 9, 2026Updated last month
- MiroTrain is an efficient and algorithm-first framework research agent.☆133Aug 27, 2025Updated 6 months ago
- A set of examples based on verl for end-to-end RL training recipes.☆183Feb 10, 2026Updated 2 weeks ago
- ThinkGen: Generalized Thinking for Visual Generation☆51Dec 30, 2025Updated 2 months ago
- Scaling Deep Research via Reinforcement Learning in Real-world Environments.☆705Oct 15, 2025Updated 4 months ago
- ASTRA is an end-to-end system for synthesizing agentic trajectories and rule-verifiable environments for SFT and RL training, developed b…☆114Jan 30, 2026Updated last month
- A concise guide for new students of Lab 246 at Beijing Language and Culture University (北语)☆10May 30, 2022Updated 3 years ago
- A collection of awesome thinking-with-videos papers.☆90Dec 1, 2025Updated 3 months ago
- Martingale posterior neural networks for fast sequential decision making @ NeurIPS 2025☆23Nov 13, 2025Updated 3 months ago
- AI-native knowledge kernel for human/agent collaboration. Use it as a Knowledge Base, Wiki, Annotator, Research Tool, or Agentic Memory.☆29Updated this week
- Multicast learning in a network programming course☆10Oct 30, 2020Updated 5 years ago
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning.☆256Aug 12, 2025Updated 6 months ago
- 🔍 Awesome Agentic Search is a curated list of papers, tools, and resources on agentic search—where AI agents plan, search, and reason to…☆54Aug 28, 2025Updated 6 months ago
- Code repo for "LifelongAgentBench: Evaluating LLM Agents as Lifelong Learners"☆77May 30, 2025Updated 9 months ago
- ☆16Sep 17, 2024Updated last year
- Auction Theory Toolbox – Computer Verified Auctions☆14Jul 12, 2016Updated 9 years ago
- Solutions to Ireland, Rosen exercises in "A Classical Introduction to Modern Number Theory"☆13Nov 7, 2024Updated last year