SALT-NLP / collaborative-gym
Framework and toolkits for building and evaluating collaborative agents that can work together with humans.
☆16 Updated this week
Alternatives and similar repositories for collaborative-gym:
Users who are interested in collaborative-gym are comparing it to the libraries listed below:
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024) ☆55 Updated 8 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024] ☆107 Updated last month
- ☆48 Updated last month
- ☆11 Updated last month
- This is the official repository of the paper "OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI" ☆90 Updated last month
- Exploring Model Kinship for Merging Large Language Models ☆22 Updated 2 months ago
- ☆116 Updated 3 months ago
- ☆90 Updated this week
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆77 Updated 3 months ago
- Code for the EMNLP 2024 paper "Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps" ☆118 Updated 5 months ago
- ☆29 Updated this week
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024) ☆35 Updated 3 weeks ago
- ☆29 Updated this week
- ☆23 Updated 4 months ago
- [NeurIPS 2023] PyTorch code for Can Language Models Teach? Teacher Explanations Improve Student Performance via Theory of Mind ☆67 Updated last year
- UGround: Universal GUI Visual Grounding for GUI Agents ☆141 Updated this week
- The Official Code Repository for GUI-World. ☆44 Updated last month
- augmented LLM with self reflection ☆109 Updated last year
- 🌐 WebWalker: Benchmarking LLMs in Web Traversal ☆54 Updated this week
- ☆120 Updated 7 months ago
- This is the implementation of LeCo ☆30 Updated 6 months ago
- Code for Paper: Harnessing Webpage UIs for Text-Rich Visual Understanding ☆44 Updated last month
- 🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agent", ACL'24 Best Resource Pap… ☆136 Updated last month
- Benchmarking LLMs with Challenging Tasks from Real Users ☆206 Updated 2 months ago
- [NeurIPS 2024] Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study ☆40 Updated last month
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆53 Updated 10 months ago
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs. ☆79 Updated 11 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment