doslim / Evaluate-the-Opinion-Leadership-of-LLMs
Evaluate the Opinion Leadership of LLMs in the Werewolf Game
☆9 · Updated 2 months ago
Related projects
Alternatives and complementary repositories for Evaluate-the-Opinion-Leadership-of-LLMs
- A lightweight script for processing HTML page to markdown format with support for code blocks ☆73 · Updated 7 months ago
- This is a collection of resources for computer-use agents, including videos, blogs, papers, and projects. ☆102 · Updated 2 weeks ago
- Unleashing the Power of Cognitive Dynamics on Large Language Models ☆60 · Updated last month
- Evaluation for AI apps and agent ☆35 · Updated 10 months ago
- Reformatted Alignment ☆112 · Updated last month
- Repo for the paper "AgentRE: An Agent-Based Framework for Navigating Complex Information Landscapes in Relation Extraction". ☆50 · Updated 3 months ago
- Code for ICLR 2024 paper "CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets" ☆48 · Updated 5 months ago
- ☆78 · Updated 2 months ago
- ☆78 · Updated 7 months ago
- ☆83 · Updated 7 months ago
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs. ☆76 · Updated 9 months ago
- This is the official repository of the paper "OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI" ☆86 · Updated last month
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning ☆33 · Updated last year
- The Official Code Repository for GUI-World. ☆41 · Updated 3 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆30 · Updated 9 months ago
- Control LLM generation format efficiently. A simple version of microsoft/aici in vllm and transformers ☆12 · Updated 5 months ago
- ☆51 · Updated 3 months ago
- ☆54 · Updated last month
- [EMNLP 2024] Ask-before-Plan: Proactive Language Agents for Real-World Planning ☆13 · Updated 3 weeks ago
- ☆17 · Updated 4 months ago
- ☆18 · Updated 3 weeks ago
- ☆35 · Updated 2 months ago
- Source code of "Reasons to Reject? Aligning Language Models with Judgments" ☆56 · Updated 8 months ago
- Hammer: Robust Function-Calling for On-Device Language Models via Function Masking ☆33 · Updated this week
- [NeurIPS 2024 D&B Track] GTA: A Benchmark for General Tool Agents ☆46 · Updated 2 weeks ago
- ☆19 · Updated 5 months ago
- A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents ☆31 · Updated this week
- SkyScript-100M: 1,000,000,000 Pairs of Scripts and Shooting Scripts for Short Drama: https://arxiv.org/abs/2408.09333v2 ☆99 · Updated this week
- ☆48 · Updated 8 months ago