Framework and toolkits for building and evaluating collaborative agents that can work together with humans.
☆129 Apr 30, 2026 Updated this week
Alternatives and similar repositories for collaborative-gym
Users that are interested in collaborative-gym are comparing it to the libraries listed below.
- ☆29 Nov 27, 2025 Updated 5 months ago
- code and data associated with CoMPosT: Characterizing and Evaluating Caricature in LLM Simulations ☆11 Oct 13, 2023 Updated 2 years ago
- AutoLibra: Metric Induction for Agents from Open-Ended Human Feedback ☆19 Apr 23, 2026 Updated last week
- This is the repository for paper EscapeBench: Pushing Language Models to Think Outside the Box ☆18 Dec 19, 2024 Updated last year
- A data construction and evaluation framework to quantify privacy norm awareness of language models (LMs) and emerging privacy risk of LM … ☆45 Mar 4, 2025 Updated last year
- Data and Code for StructuredRegex. ☆14 Nov 16, 2023 Updated 2 years ago
- ☆10 Jun 15, 2024 Updated last year
- An Autonomous Curriculum Reinforcement Learning framework that steers agents to continually learn in specific environments with zero huma… ☆30 Feb 25, 2026 Updated 2 months ago
- Official implementation of "Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation" (CVPR 202… ☆40 May 26, 2025 Updated 11 months ago
- Azure Command-Line Interface ☆15 Mar 26, 2026 Updated last month
- ☆11 Jan 3, 2024 Updated 2 years ago
- Code repo for the paper: Attacking Vision-Language Computer Agents via Pop-ups ☆51 Dec 23, 2024 Updated last year
- PRODIGy is a collection of dialogues in which each conversation is aligned with speaker profile representations. ☆19 Jan 8, 2025 Updated last year
- Code for the paper "Aligning LLM Agents by Learning Latent Preference from User Edits". ☆44 Nov 23, 2024 Updated last year
- The official code for NAACL 2024 paper: $E^5$: Zero-shot Hierarchical Table Analysis using Augmented LLMs via Explain, Extract, Execute, … ☆15 Jun 23, 2024 Updated last year
- [COLM 2025] EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees ☆31 Jul 11, 2025 Updated 9 months ago
- Learning Accurate Decision Trees with Bandit Feedback via Quantized Gradient Descent ☆16 Sep 8, 2022 Updated 3 years ago
- ☆15 Mar 26, 2024 Updated 2 years ago
- Extended Inductive Reasoning for Personalized Preference Inference from Behavioral Signals ☆11 Jan 8, 2026 Updated 3 months ago
- Reproducible Language Agent Research ☆35 Jun 25, 2025 Updated 10 months ago
- Text Adventure Learning Environment Suite - Benchmark to evaluate language models on interactive text environments. ☆27 Updated this week
- [ICML'21 Oral] Improving Lossless Compression Rates via Monte Carlo Bits-Back Coding ☆14 Jun 10, 2021 Updated 4 years ago
- ☆61 Sep 24, 2024 Updated last year
- A new tiny QR code for robot cameras (robot vision: recognition and localization of tiny QR codes) ☆10 Jul 26, 2021 Updated 4 years ago
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks ☆266 May 5, 2025 Updated last year
- The official data and code for EMNLP 2023 main conference paper: CRT-QA: A Dataset of Complex Reasoning Question Answering over Tabular D… ☆13 May 19, 2025 Updated 11 months ago
- Official Repo for CRMArena and CRMArena-Pro ☆136 Apr 14, 2026 Updated 3 weeks ago
- ☆12 Feb 4, 2025 Updated last year
- ☆19 Jan 3, 2025 Updated last year
- A Deep RL Wordle Bot ☆12 Dec 6, 2022 Updated 3 years ago
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025] ☆672 Jul 29, 2025 Updated 9 months ago
- Code for "Proposition-Level Clustering for Multi-Document Summarization" paper ☆10 Apr 5, 2024 Updated 2 years ago
- This repo investigates LLMs' tendency to exhibit acquiescence bias in sequential QA interactions. Includes evaluation methods, datasets, … ☆39 Apr 24, 2026 Updated last week
- ☆49 Sep 7, 2025 Updated 7 months ago
- Implementation of SOAR ☆52 Sep 17, 2025 Updated 7 months ago
- ☆38 May 15, 2025 Updated 11 months ago
- ☆12 Jan 4, 2024 Updated 2 years ago
- Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight) ☆300 Jan 23, 2026 Updated 3 months ago
- ☆15 Feb 21, 2024 Updated 2 years ago