ResearcherBench: Evaluating Deep AI Research Systems on the Frontiers of Scientific Inquiry
☆47Jan 5, 2026Updated 3 months ago
Alternatives and similar repositories for ResearcherBench
Users who are interested in ResearcherBench are comparing it to the libraries listed below.
- Yet another BUPT Bachelor Thesis LaTeX class. Beijing University of Posts and Telecommunications undergraduate thesis (graduation project) LaTeX template☆18Mar 6, 2026Updated last month
- [NeurIPS'25 D&B] Mind2Web-2 Benchmark: Evaluating Agentic Search with Agent-as-a-Judge☆105Feb 28, 2026Updated last month
- Repository of paper "Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis" (ACL 2025 Main)☆19Jul 19, 2025Updated 8 months ago
- ☆42Dec 16, 2025Updated 3 months ago
- [NeurIPS 2025 D&B Track] Evaluation Code Repo for Paper "PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts"☆43May 22, 2025Updated 10 months ago
- ☆27Mar 10, 2026Updated last month
- Official repository of the video reasoning benchmark MMR-V. Can Your MLLMs "Think with Video"?☆38Jun 23, 2025Updated 9 months ago
- Transferability of Natural Language Inference to Biomedical Question Answering☆12Mar 25, 2021Updated 5 years ago
- ☆13Oct 3, 2023Updated 2 years ago
- Efficient retrieval head analysis with triton flash attention that supports topK probability☆13Jun 15, 2024Updated last year
- DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents☆676Updated this week
- AgentsCourt: Building Judicial Decision-Making Agents with Court Debate Simulation and Legal Knowledge Augmentation (EMNLP 2024 Findings)☆16Dec 30, 2024Updated last year
- Minimal PyTorch implementation of TP, SP, FSDP and sharded-EMA☆32Nov 27, 2025Updated 4 months ago
- ☆14Oct 21, 2024Updated last year
- [WSDM 2024 Best Paper Honorable Mention] Debiasing Sequential Recommenders through Distributionally Robust Optimization over System Expos…☆15Jun 20, 2024Updated last year
- [EMNLP2022] Transformer-based Entity Typing in Knowledge Graphs☆16Nov 26, 2024Updated last year
- [AAAI 2025] Assessing the Creativity of LLMs in Proposing Novel Solutions to Mathematical Problems☆13May 5, 2025Updated 11 months ago
- LaTeX Beamer template crafted for University of Illinois Chicago☆11Dec 7, 2024Updated last year
- ☆19Jan 16, 2022Updated 4 years ago
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆16Dec 19, 2024Updated last year
- Convert MathML to Latex for OneNote to Markdown☆13Mar 17, 2026Updated 3 weeks ago
- codes for "Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models"☆12Feb 10, 2025Updated last year
- ☆13Jul 2, 2025Updated 9 months ago
- [MM 2023 Oral] Online Distillation-enhanced Multi-modal Transformer for Sequential Recommendation☆17Jan 10, 2024Updated 2 years ago
- [ICLR 2025] Official Implementation of "TS-LIF: A Temporal Segment Spiking Neuron Network for Time Series Forecasting"☆24Mar 10, 2025Updated last year
- [CCIR 2023] Self-supervised learning for Sequential Recommender Systems☆24Nov 7, 2023Updated 2 years ago
- ☆20Jan 26, 2026Updated 2 months ago
- Top Picks for Data Science Self-Study: From Newbies to Pros!☆11Apr 2, 2024Updated 2 years ago
- Description for MV-MATH☆15Jul 20, 2025Updated 8 months ago
- [CVPR2025] Official implementation of RAM☆29Nov 4, 2025Updated 5 months ago
- [AAAI 2025] Augmenting Math Word Problems via Iterative Question Composing (https://arxiv.org/abs/2401.09003)☆23Oct 2, 2025Updated 6 months ago
- ☆99Aug 8, 2025Updated 8 months ago
- Python Vector Search tutorial generated using gpt4☆12Mar 18, 2023Updated 3 years ago
- Official codebase for the ACL 2025 Findings paper: Optimized Text Embedding Models and Benchmarks for Amharic Passage Retrieval.☆21Jul 26, 2025Updated 8 months ago
- ☆18Mar 3, 2025Updated last year
- Code for Research Project TLDR☆25Jul 28, 2025Updated 8 months ago
- ☆26Mar 17, 2025Updated last year
- Scaling Deep Research via Reinforcement Learning in Real-world Environments.☆724Oct 15, 2025Updated 5 months ago
- Official code of "AutoSNN: Towards Energy-Efficient Spiking Neural Networks," ICML22☆18May 29, 2022Updated 3 years ago