DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation.
☆137Feb 10, 2026Updated 4 months ago
Alternatives and similar repositories for DeepResearchEval
Users who are interested in DeepResearchEval are comparing it to the libraries listed below.
- [NeurIPS 2025@FoRLM] R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search☆17Jan 24, 2026Updated 4 months ago
- [CVPR 2026] OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe☆162Mar 30, 2026Updated 2 months ago
- [arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies☆60Feb 6, 2026Updated 4 months ago
- Repo for paper "Agentic-R: Learning to Retrieve for Agentic Search" (ACL 2026 Findings)☆86Apr 9, 2026Updated 2 months ago
- ⚔️ [ICLR 2026] Official code of "Search Arena: Analyzing Search-Augmented LLMs".☆59Feb 23, 2026Updated 3 months ago
- MiroRL is an MCP-first reinforcement learning framework for deep research agents.☆246Aug 27, 2025Updated 9 months ago
- ☆46Dec 16, 2025Updated 5 months ago
- ☆62Jun 7, 2025Updated last year
- Synthetic Data Generation with Execution-Based Verification and Grounding for LLM Training.☆21Feb 7, 2025Updated last year
- [ICLR'26] MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs☆51Apr 17, 2026Updated last month
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning.☆279Aug 12, 2025Updated 9 months ago
- Awesome Audio-Visual Intelligence, Survey of Audio-Visual Intelligence☆77May 8, 2026Updated last month
- Code for the paper "Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving"☆19May 25, 2023Updated 3 years ago
- ☆165Mar 18, 2026Updated 2 months ago
- Large Language Models Can Self-Improve in Long-context Reasoning☆72Nov 24, 2024Updated last year
- ☆11Dec 8, 2022Updated 3 years ago
- Agentic Learning Powered by AWorld☆109Apr 16, 2026Updated last month
- ☆16May 18, 2026Updated 3 weeks ago
- [ICLR 2024] Adaptive Replay Ratio implementation from 'Revisiting Plasticity in Visual RL: Data, Modules and Training Stages'.☆13Oct 9, 2024Updated last year
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning☆99Feb 21, 2025Updated last year
- Scaling Deep Research via Reinforcement Learning in Real-world Environments.☆760May 10, 2026Updated last month
- [AAAI 2025] Assessing the Creativity of LLMs in Proposing Novel Solutions to Mathematical Problems☆13May 5, 2025Updated last year
- [EMNLP 24] Source code for paper 'AdaZeta: Adaptive Zeroth-Order Tensor-Train Adaption for Memory-Efficient Large Language Models Fine-Tu…☆13Dec 15, 2024Updated last year
- ☆10May 31, 2021Updated 5 years ago
- A non-official re-implementation of article "[ECCV 18] Image Inpainting for Irregular Holes Using Partial Convolutions"☆12Mar 1, 2025Updated last year
- ☆41May 26, 2026Updated 2 weeks ago
- ☆28Mar 10, 2026Updated 3 months ago
- PICABench: How Far Are We from Physically Realistic Image Editing?☆38Nov 5, 2025Updated 7 months ago
- SimX-OR: Extending Any Simulation Benchmark to Evaluate the Observational Robustness of VLA Models☆33Nov 4, 2025Updated 7 months ago
- [ICLR 2025] <MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses>☆56Nov 12, 2025Updated 6 months ago
- (ICLR 2025) AgentRefine: Enhancing Agent Generalization through Refinement Tuning☆19Nov 22, 2025Updated 6 months ago
- Code for the paper "Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning". Great performance in many environments…☆39Oct 24, 2025Updated 7 months ago
- Group Meeting Record for Baobao Chang Group in Peking University☆26May 17, 2021Updated 5 years ago
- This repository contains the code for the paper “Neuro-Symbolic Query Compiler”, accepted to the Findings of ACL 2025.☆17Oct 20, 2025Updated 7 months ago
- Archer2.0 evolves from its predecessor by introducing ASPO, which overcomes fundamental PPO-Clip limitations to prevent premature converg…☆31Oct 10, 2025Updated 8 months ago
- DNN_Partition helper tool for simple performance profiling of PyTorch models, with support for model partitioning☆14May 31, 2021Updated 5 years ago
- Piece-wise CNN for relation extraction.☆12Oct 22, 2018Updated 7 years ago
- ☆65Dec 10, 2025Updated 6 months ago
- LATTICE turns retrieval into an LLM-driven navigation problem over a semantic scaffold☆37Mar 9, 2026Updated 3 months ago