jonathanmli/Avalon-LLM

This repository contains an LLM benchmark for the social deduction game "Resistance Avalon".

☆159

Alternatives and similar repositories for Avalon-LLM

Users who are interested in Avalon-LLM are comparing it to the repositories listed below.

HenryCai11 / LLM-Self-Control
The official repo of paper "Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller"
☆18 · Aug 13, 2024 · Updated last year
StringNLPLAB / MGS
Repository for the paper "Advancing General-Purpose Reasoning Models with Modular Gradient Surgery"
☆21 · Mar 15, 2026 · Updated 4 months ago
Shenzhi-Wang / recon
The official source code for "Boosting LLM Agents with Recursive Contemplation for Effective Deception Handling" (ACL 2024, Findings)
☆15 · Aug 12, 2024 · Updated last year
facebookresearch / decrypto
Implementation of the Decrypto benchmark for multi-agent reasoning and theory of mind.
☆22 · Jan 19, 2026 · Updated 6 months ago
3DAgentWorld / LLM-Game-Agent
☆24 · Oct 13, 2024 · Updated last year
Detry322 / DeepRole
The code used to power DeepRole
☆38 · Nov 21, 2022 · Updated 3 years ago
SalesforceAIResearch / LaTRO
☆127 · Jun 2, 2026 · Updated last month
CUHK-ARISE / GAMABench
Code and data for the paper: Competing Large Language Models in Multi-Agent Gaming Environments
☆98 · Jan 26, 2026 · Updated 6 months ago
SalesforceAIResearch / swecomm
☆28 · Jun 2, 2026 · Updated last month
THUDM / ReST-MCTS
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)
☆709 · Jan 20, 2025 · Updated last year
eigent-ai / toolathlon_gym
Toolathlon-Gym for testing AI agents real-world tool-use capabilities across diverse MCP servers.
☆140 · Jul 22, 2026 · Updated last week
mindagent / mindagent
☆102 · Jun 12, 2024 · Updated 2 years ago
siyuyuan / evoagent
Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms"
☆167 · Oct 19, 2024 · Updated last year
huiwy / reflection-on-trees
☆14 · May 9, 2024 · Updated 2 years ago
THUDM / AgentBench
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
☆3,611 · Feb 8, 2026 · Updated 5 months ago
rohinmanvi / Capability-Aware-and-Mid-Generation-Self-Evaluations
☆21 · Jul 25, 2025 · Updated last year
salavi / Clever_Hans_or_N-ToM
☆12 · May 6, 2024 · Updated 2 years ago
daeh / computed-appraisals
Computed Appraisals Model. Code and data for the 2023 paper, "Emotion prediction as computation over a generative theory of mind"
☆13 · Jun 12, 2023 · Updated 3 years ago
HypherX / Evolution-Analysis
☆25 · Dec 13, 2024 · Updated last year
portal-cornell / muCode
☆33 · Oct 2, 2025 · Updated 9 months ago
iwangjian / Color4Dial
Dialogue Planning via Brownian Bridge Stochastic Process for Goal-directed Proactive Dialogue (ACL Findings 2023)
☆21 · Nov 10, 2025 · Updated 8 months ago
chen-judge / SPC
[NeurIPS 25] The official implementation of SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning
☆30 · Sep 21, 2025 · Updated 10 months ago
jordddan / GameEval
Using conversational games to evaluate powerful LLMs
☆18 · Sep 3, 2023 · Updated 2 years ago
cathyxl / MAgIC
☆43 · Nov 13, 2024 · Updated last year
princeton-nlp / LM-Science-Tutor
☆50 · Aug 6, 2024 · Updated last year
SIMONLQY / RethinkMCTS
☆34 · Oct 2, 2024 · Updated last year
likenneth / dialogue_action_token
Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner
☆31 · Jun 27, 2024 · Updated 2 years ago
eryk-mazus / sigh
Seamless Voice Interactions with LLMs
☆12 · Oct 28, 2023 · Updated 2 years ago
AlaaLab / pathologist-in-the-loop
[NeurIPS 2023] Official Codebase for "Aligning Synthetic Medical Images with Clinical Knowledge using Human Feedback"
☆20 · Oct 19, 2023 · Updated 2 years ago
zhangxy-2019 / sgp-tod
☆14 · Aug 21, 2025 · Updated 11 months ago
iQua / llmpebase
This is a unified platform for implementing and evaluating test-time reasoning mechanisms in Large Language Models (LLMs).
☆18 · Jan 16, 2025 · Updated last year
S-Abdelnabi / LLM-Deliberation
Code for our NeurIPS'24 Dataset and Benchmark paper: Cooperation, Competition, and Maliciousness: LLM-Stakeholders Interactive Negotiatio…
☆54 · Nov 11, 2024 · Updated last year
thunlp / AutoForm
Code for paper "Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication"
☆23 · Mar 30, 2024 · Updated 2 years ago
aogara-ds / hoodwinked
Text-based game of lies and deceit, made for language models.
☆32 · Aug 25, 2023 · Updated 2 years ago
technion-cs-nlp / hallucination-mitigation
☆23 · Dec 17, 2024 · Updated last year
Holmeswww / SPRING
☆15 · Mar 26, 2024 · Updated 2 years ago
MAGAer13 / DeCapBench
Official Code for "Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning" (ICLR 2025)
☆14 · Mar 6, 2025 · Updated last year
SALT-NLP / PersuationGames
[ACL2023, Findings] Source codes for the paper "Werewolf Among Us: Multimodal Resources for Modeling Persuasion Behaviors in Social Deduc…
☆16 · Feb 22, 2025 · Updated last year
waterhorse1 / Natural-language-RL
Natural Language Reinforcement Learning
☆101 · Jul 30, 2025 · Updated 11 months ago