MurtyShikhar / NNetnav
Interaction-first method for generating demonstrations for web-agents on any website
☆44 Updated 4 months ago
Alternatives and similar repositories for NNetnav
Users that are interested in NNetnav are comparing it to the libraries listed below
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆61 Updated 8 months ago
- ☆133 Updated 5 months ago
- Analysis code for paper "SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks" ☆47 Updated 3 weeks ago
- ☆81 Updated 10 months ago
- ☆56 Updated 2 months ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆88 Updated 11 months ago
- accompanying material for sleep-time compute paper ☆107 Updated 4 months ago
- Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents ☆127 Updated last year
- Framework and toolkits for building and evaluating collaborative agents that can work together with humans. ☆95 Updated 4 months ago
- WebLINX is a benchmark for building web navigation agents with conversational capabilities ☆157 Updated 6 months ago
- Official code for the paper "ADaPT: As-Needed Decomposition and Planning with Language Models" ☆89 Updated last year
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆54 Updated 3 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆92 Updated 7 months ago
- AWM: Agent Workflow Memory ☆306 Updated 7 months ago
- ☆84 Updated last year
- Official Repo for CRMArena and CRMArena-Pro ☆109 Updated 2 months ago
- A library for benchmarking the Long Term Memory and Continual learning capabilities of LLM based agents. With all the tests and code you… ☆76 Updated 8 months ago
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆92 Updated 3 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆179 Updated 5 months ago
- Source code for the collaborative reasoner research project at Meta FAIR. ☆103 Updated 4 months ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆115 Updated 11 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆88 Updated this week
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆74 Updated 5 months ago
- II-Thought-RL is our initial attempt at developing a large-scale, multi-domain Reinforcement Learning (RL) dataset ☆27 Updated 4 months ago
- Repository for the paper Stream of Search: Learning to Search in Language ☆150 Updated 6 months ago
- WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks? ☆201 Updated this week
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆84 Updated 5 months ago
- Run SWE-bench evaluations remotely ☆40 Updated 2 weeks ago
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals". ☆69 Updated last year
- Beating the GAIA benchmark with Transformers Agents. 🚀 ☆133 Updated 6 months ago