OSU-NLP-Group / SeeActChromeExtension
☆15 Updated 4 months ago
Alternatives and similar repositories for SeeActChromeExtension
Users interested in SeeActChromeExtension are comparing it to the repositories listed below
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆54 Updated 5 months ago
- ☆20 Updated 2 months ago
- Exploration using DSPy to optimize modules to maximize performance on the OpenToM dataset ☆16 Updated last year
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆34 Updated last year
- ☆40 Updated 9 months ago
- 🔔🧠 Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks! ☆51 Updated 2 months ago
- ☆64 Updated last month
- ☆27 Updated 10 months ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆32 Updated last month
- Code for the paper: CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models ☆20 Updated last month
- ☆13 Updated 5 months ago
- ☆32 Updated last year
- ☆41 Updated 5 months ago
- CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments ☆55 Updated 2 months ago
- ☆15 Updated last month
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆25 Updated 5 months ago
- ☆38 Updated 4 months ago
- LLMs as Collaboratively Edited Knowledge Bases ☆45 Updated last year
- ☆50 Updated 5 months ago
- 🌟 SwarmAgent: A framework for simulating social group dynamics using multi-agent collaboration, aiding insights into collective behavior… ☆11 Updated last year
- Official Implementation of UA^{2}-Agent and other baseline algorithms of "Towards Unified Alignment Between Agents, Humans, and Environme… ☆17 Updated 6 months ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆43 Updated last year
- ☆42 Updated last month
- ☆25 Updated 7 months ago
- ☆18 Updated 7 months ago
- Verifiers for LLM Reinforcement Learning ☆50 Updated last month
- The Swarm Ecosystem ☆20 Updated 9 months ago
- Nexusflow function call, tool use, and agent benchmarks. ☆19 Updated 5 months ago
- Measuring and Controlling Persona Drift in Language Model Dialogs ☆17 Updated last year
- Code and Dataset for Learning to Solve Complex Tasks by Talking to Agents ☆24 Updated 2 years ago