[NeurIPS 2025 Spotlight] Official repository for "Web-Shepherd: Advancing PRMs for Reinforcing Web Agents"
☆54May 21, 2025Updated last year
Alternatives and similar repositories for Web-Shepherd
Users that are interested in Web-Shepherd are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [ACL 2025] "World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning." https://arxiv.org/abs/2503.1…☆18Jul 22, 2025Updated 10 months ago
- ☆12Aug 8, 2024Updated last year
- 🎮Manipulates mobile phones just like how you would. Official code for "MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficien…☆28Oct 10, 2025Updated 7 months ago
- Official repo for StyleMe3D☆30Apr 22, 2025Updated last year
- ☆10Aug 16, 2024Updated last year
- AI Agents on DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- Under construction☆13Jan 15, 2025Updated last year
- ☆12Jul 4, 2024Updated last year
- This is the repository for paper "CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models"☆31Oct 8, 2023Updated 2 years ago
- ☆22May 3, 2025Updated last year
- ☆13Aug 4, 2025Updated 9 months ago
- ☆12Dec 6, 2024Updated last year
- ☆18Mar 2, 2026Updated 2 months ago
- ☆14Dec 25, 2024Updated last year
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories☆44Aug 7, 2025Updated 9 months ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- The source code and dataset mentioned in the paper Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmar…☆56Nov 5, 2024Updated last year
- [ACL 2025] Official code for ''Learning to Reason from Feedback at Test-Time''.☆14May 16, 2025Updated last year
- [ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions".☆17Feb 9, 2026Updated 3 months ago
- ☆28Aug 19, 2025Updated 9 months ago
- [NeurIPS 2024] The official implementation of "Image Copy Detection for Diffusion Models"☆18Oct 1, 2024Updated last year
- [ICLR 2026] AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents☆41Apr 17, 2026Updated last month
- [ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety"☆35Jun 23, 2025Updated 11 months ago
- ☆32Sep 27, 2024Updated last year
- [ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion☆14Mar 17, 2025Updated last year
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- UnOfficial Gradio Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Y…☆16Sep 30, 2024Updated last year
- ☆152May 14, 2025Updated last year
- Smurfs: Leveraging Multiple Proficiency Agents with Context-Efficiency for Tool Planning☆15Jun 24, 2025Updated 10 months ago
- Code for "[COLM'25] RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing"☆24Mar 18, 2025Updated last year
- Official code repo for NeurIPS 2025 Spotlight paper, "Debate or Vote: Which Yields Better Decisions in Multi-Agent LLMs?"☆73Oct 15, 2025Updated 7 months ago
- A very hacky set of functions for getting plotly to do what I want when doing mech interp research, designed to be compatible with PyTorc…☆13Jun 16, 2023Updated 2 years ago
- ☆36Jan 15, 2026Updated 4 months ago
- [ACL 2026 Findings] CoV: Chain-of-View Prompting for Spatial Reasoning☆60Apr 7, 2026Updated last month
- TopViewRS: Vision-Language Models as Top-View Spatial Reasoners (EMNLP 2024 Oral)☆15Jun 14, 2025Updated 11 months ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- ☆11Jul 21, 2024Updated last year
- [EMNLP 2025] Code for paper "Table-R1: Inference-Time Scaling for Table Reasoning"☆29Jun 3, 2025Updated 11 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024]☆149Nov 26, 2024Updated last year
- Collection of papers about video-audio understanding☆25Dec 26, 2025Updated 4 months ago
- (ICLR 2025) The Official Code Repository for GUI-World.☆69Dec 18, 2024Updated last year
- ☆121Apr 8, 2025Updated last year
- ☆12Jun 18, 2024Updated last year