boluoweifenda / werewolf
☆24 Updated last year
Alternatives and similar repositories for werewolf
Users who are interested in werewolf are comparing it to the repositories listed below.
- A collection of LLM with RL papers ☆278 Updated last year
- Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models) ☆193 Updated last year
- [NeurIPS 2023] Large Language Models Are Semi-Parametric Reinforcement Learning Agents ☆37 Updated last year
- This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO. ☆326 Updated last year
- Implementation of "Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agen… ☆286 Updated 2 years ago
- This repo is a live list of papers on game playing and large multimodality model - "A Survey on Game Playing Agents and Large Models: Met… ☆154 Updated last year
- ☆51 Updated 4 months ago
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆193 Updated 5 months ago
- ☆23 Updated 11 months ago
- ☆342 Updated 4 months ago
- Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs" ☆384 Updated 8 months ago
- (ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training ☆282 Updated last year
- Reference implementation for Token-level Direct Preference Optimization (TDPO) ☆148 Updated 7 months ago
- Multi-agent Social Simulation + Efficient, Effective, and Stable alternative of RLHF. Code for the paper "Training Socially Aligned Langu… ☆351 Updated 2 years ago
- SmartPlay is a benchmark for Large Language Models (LLMs). Uses a variety of games to test various important LLM capabilities as agents. … ☆141 Updated last year
- Benchmarking LLMs' Gaming Ability in Multi-Agent Environments ☆88 Updated 5 months ago
- The Code Repo for Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization ☆121 Updated last year
- AI Alignment: A Comprehensive Survey ☆134 Updated last year
- ☆19 Updated 9 months ago
- ☆95 Updated last year
- ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum. ☆295 Updated 2 months ago
- RLHF implementation details of OAI's 2019 codebase ☆191 Updated last year
- Self-playing Adversarial Language Game Enhances LLM Reasoning, NeurIPS 2024 ☆139 Updated 7 months ago
- Machine Theory of Mind Reading List. Built upon EMNLP Findings 2023 Paper: Towards A Holistic Landscape of Situated Theory of Mind in Lar… ☆144 Updated 7 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models ☆267 Updated last year
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆190 Updated 6 months ago
- A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior. ☆246 Updated 5 months ago
- This is the official implementation of "Progressive-Hint Prompting Improves Reasoning in Large Language Models" ☆209 Updated 2 years ago
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning" ☆137 Updated 2 months ago
- Code for ACL2024 paper - Adversarial Preference Optimization (APO). ☆56 Updated last year