OSU-NLP-Group / SeeActChromeExtension
☆16 Updated 8 months ago
Alternatives and similar repositories for SeeActChromeExtension
Users who are interested in SeeActChromeExtension are comparing it to the libraries listed below.
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆62 Updated 9 months ago
- Run SWE-bench evaluations remotely ☆42 Updated last month
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆36 Updated last year
- ☆67 Updated 5 months ago
- ☆11 Updated 10 months ago
- Code for the paper: Harnessing Webpage UIs for Text-Rich Visual Understanding ☆53 Updated 9 months ago
- ☆41 Updated last year
- Enhancement in Multimodal Representation Learning. ☆40 Updated last year
- Nexusflow function call, tool use, and agent benchmarks. ☆29 Updated 9 months ago
- ☆30 Updated last year
- ☆86 Updated last year
- ☆56 Updated 2 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆92 Updated 8 months ago
- [ACL 2025] AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant ☆40 Updated 9 months ago
- Exploration using DSPy to optimize modules to maximize performance on the OpenToM dataset ☆20 Updated last year
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆26 Updated 10 months ago
- ☆47 Updated last year
- 🔔🧠 Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks! ☆54 Updated 2 months ago
- ☆23 Updated last year
- Generate High Quality textual or multi-modal datasets with Agents ☆18 Updated 2 years ago
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper. ☆55 Updated 6 months ago
- Small, simple agent task environments for training and evaluation ☆18 Updated 10 months ago
- Verifiers for LLM Reinforcement Learning ☆74 Updated 5 months ago
- ☆40 Updated 9 months ago
- The Library for LLM-based multi-agent applications ☆90 Updated 2 months ago
- ☆50 Updated 4 months ago
- ☆13 Updated 4 months ago
- ☆35 Updated 4 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆22 Updated 10 months ago
- ScreenSuite - The most comprehensive benchmarking suite for GUI Agents! ☆115 Updated last month