OPPO-PersonalAI / FINDER_DEFT
Official implementation for paper "How Far Are We from Genuinely Useful Deep Research Agents?"
☆51 Updated last week
Alternatives and similar repositories for FINDER_DEFT
Users who are interested in FINDER_DEFT are comparing it to the repositories listed below.
- WideSearch: Benchmarking Agentic Broad Info-Seeking ☆106 Updated 2 months ago
- SSRL: Self-Search Reinforcement Learning ☆158 Updated 4 months ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆51 Updated 6 months ago
- ☆60 Updated last year
- RL Scaling and Test-Time Scaling (ICML'25) ☆112 Updated 10 months ago
- [ICLR 2025 Oral] "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free" ☆83 Updated last year
- [NeurIPS 2024] OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI ☆108 Updated 9 months ago
- The code for the paper: "Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models" ☆55 Updated last month
- Official implementation for "Law of the Weakest Link: Cross capabilities of Large Language Models" ☆43 Updated last year
- ☆70 Updated last year
- Official Implementation of "Reasoning Language Models: A Blueprint" ☆92 Updated 4 months ago
- Verifiers for LLM Reinforcement Learning ☆80 Updated 8 months ago
- [ACL 2025] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLM… ☆68 Updated last year
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆84 Updated 8 months ago
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization ☆80 Updated 2 months ago
- ☆35 Updated last year
- Process Reward Models That Think ☆64 Updated 3 weeks ago
- Open source code of the paper: "OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain" ☆78 Updated last year
- IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent ☆67 Updated 7 months ago
- Neuro-Symbolic Integration Brings Causal and Reliable Reasoning Proofs ☆40 Updated last year
- BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent ☆130 Updated last week
- ☆86 Updated 4 months ago
- [EMNLP 2025] CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward ☆58 Updated 4 months ago
- Source code of "Reasons to Reject? Aligning Language Models with Judgments" ☆58 Updated last year
- The official repo of "WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents" ☆89 Updated 2 months ago
- [ACL 2025] An inference-time decoding strategy with adaptive foresight sampling ☆105 Updated 7 months ago
- Codebase for Instruction Following without Instruction Tuning ☆36 Updated last year
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch ☆29 Updated 2 weeks ago
- [ICLR'25] "Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers" ☆40 Updated 8 months ago
- Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering ☆63 Updated last year