seacowx / OpenToM
The official repository of the OpenToM dataset
☆17Updated 8 months ago
Related projects ⓘ
Alternatives and complementary repositories for OpenToM
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions"☆62Updated 5 months ago
- ☆78Updated 11 months ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms"☆75Updated last month
- Official code for ACL 2023 (short, findings) paper "Recursion of Thought: A Divide and Conquer Approach to Multi-Context Reasoning with L…☆42Updated last year
- Repository for paper Tools Are Instrumental for Language Agents in Complex Environments☆32Updated last month
- ☆22Updated 2 months ago
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions☆68Updated last year
- ☆46Updated last week
- ☆20Updated last year
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆49Updated 8 months ago
- ☆41Updated 2 weeks ago
- Scripts for generating synthetic finetuning data for reducing sycophancy.☆107Updated last year
- Governance of the Commons Simulation (GovSim)☆21Updated 4 months ago
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models☆87Updated last year
- Source code for our paper: "Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction A…☆40Updated 9 months ago
- ☆48Updated last year
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for…☆20Updated last week
- Metacognitive Prompting Improves Understanding in Large Language Models (NAACL 2024)☆27Updated last year
- ☆112Updated last month
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?"☆41Updated last month
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.☆76Updated 9 months ago
- ☆74Updated 3 weeks ago
- Camel-Coder: Collaborative task completion with multiple agents. Role-based prompts, intervention mechanism, and thoughtful suggestions☆33Updated last year
- Minimal implementation of the Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models paper (ArXiv 20232401.01335)☆28Updated 8 months ago
- ☆35Updated last year
- [NeurIPS 2023] PyTorch code for Can Language Models Teach? Teacher Explanations Improve Student Performance via Theory of Mind☆67Updated 11 months ago
- Small and Efficient Mathematical Reasoning LLMs☆71Updated 9 months ago
- ☆24Updated last year
- Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning"☆46Updated last month
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models…☆30Updated 9 months ago