kyle8581 / Web-Shepherd
Official repository for "Web-Shepherd: Advancing PRMs for Reinforcing Web Agents"
☆36 · Updated last month
Alternatives and similar repositories for Web-Shepherd
Users interested in Web-Shepherd are comparing it to the repositories listed below.
- (ICLR 2025) The Official Code Repository for GUI-World. ☆61 · Updated 6 months ago
- ZeroGUI: Automating Online GUI Learning at Zero Human Cost ☆69 · Updated last week
- ☆46 · Updated this week
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents" ☆46 · Updated 4 months ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆19 · Updated 3 weeks ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework ☆63 · Updated last month
- Scaling Computer-Use Grounding via UI Decomposition and Synthesis ☆85 · Updated 3 weeks ago
- Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay ☆88 · Updated last month
- This repo contains code for the paper "Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM" ☆15 · Updated this week
- Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models ☆40 · Updated 3 weeks ago
- ☆20 · Updated 2 months ago
- FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient LLM Reasoning ☆52 · Updated last month
- "Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents" ☆78 · Updated 3 months ago
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks ☆78 · Updated 3 weeks ago
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆93 · Updated this week
- Repo for "Z1: Efficient Test-time Scaling with Code" ☆63 · Updated 3 months ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation". ☆18 · Updated 3 weeks ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆74 · Updated 3 months ago
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆57 · Updated 8 months ago
- ☆48 · Updated last month
- DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents ☆190 · Updated 3 weeks ago
- ☆30 · Updated 2 months ago
- ☆47 · Updated 5 months ago
- Official code and dataset for our EMNLP 2024 Findings paper: Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Kn… ☆19 · Updated 6 months ago
- [NeurIPS 2024] Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study ☆52 · Updated 7 months ago
- Code for "From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios" ☆26 · Updated this week
- ☆46 · Updated 2 months ago
- The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search" ☆25 · Updated 2 months ago
- Large Language Models Can Self-Improve in Long-context Reasoning ☆71 · Updated 7 months ago
- LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models ☆19 · Updated 3 months ago