microsoft / competeai
[ICML 2024 Oral] A society-simulation framework that supports complex simulations, such as multi-scene scenarios.
☆39 · Updated last month
Related projects:
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆73 · Updated 2 months ago
- Source code for our paper: "Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction A… ☆39 · Updated 7 months ago
- Official repository for paper "GTA: A Benchmark for General Tool Agents" ☆28 · Updated 2 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆29 · Updated 7 months ago
- Official Implementation of Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization ☆96 · Updated 4 months ago
- Flow of Reasoning: Efficient Training of LLM Policy with Diverse Thinking ☆25 · Updated this week
- A task generation and model evaluation system. ☆51 · Updated 2 weeks ago
- [ACL 2024] Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View ☆89 · Updated 4 months ago
- The official implementation of Self-Exploring Language Models (SELM) ☆55 · Updated 3 months ago
- Code and Data for "MIRAI: Evaluating LLM Agents for Event Forecasting" ☆49 · Updated 2 months ago
- GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations ☆43 · Updated 2 weeks ago
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆89 · Updated 4 months ago
- A curated paper list on LLM reasoning. ☆61 · Updated 6 months ago
- ☆24 · Updated last week
- "Improving Mathematical Reasoning with Process Supervision" by OPENAI ☆55 · Updated last week
- The Official Code Repository for GUI-World. ☆33 · Updated last month
- This is the official repository of the paper "OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI" ☆79 · Updated last month
- official repo for the paper "Learning From Mistakes Makes LLM Better Reasoner" ☆48 · Updated 9 months ago
- The code for "Can Large Language Model Agents Simulate Human Trust Behaviors?" ☆30 · Updated last month
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models ☆84 · Updated 11 months ago
- 🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agent", ACL'24 Best Resource Pap… ☆81 · Updated last month
- Towards Large Multimodal Models as Visual Foundation Agents ☆87 · Updated 3 weeks ago
- augmented LLM with self reflection ☆80 · Updated 10 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators ☆39 · Updated 7 months ago
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs. ☆73 · Updated 7 months ago
- [ACL 2024] AUTOACT: Automatic Agent Learning from Scratch for QA via Self-Planning ☆162 · Updated 5 months ago
- DSBench: How Far are Data Science Agents Becoming Data Science Experts? ☆20 · Updated this week
- ☆31 · Updated 8 months ago
- ☆13 · Updated this week
- ☆76 · Updated 4 months ago