Neph0s / InCharacter
Official code for the paper: InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews (previously: Do Role-Playing Chatbots Capture the Character Personalities? Assessing Personality Traits for Role-Playing Chatbots)
☆67Updated 4 months ago
Alternatives and similar repositories for InCharacter:
Users who are interested in InCharacter are comparing it to the repositories listed below
- Repository for CharacterChat, a personalized social support system☆66Updated 7 months ago
- RoleInteract: Evaluating the Social Interaction of Role-Playing Agents☆54Updated 4 months ago
- Code and Data for EMNLP 2024 Paper "Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent"☆112Updated 2 months ago
- Awesome papers for role-playing with language models☆165Updated 3 months ago
- A self-alignment method for role-play. Benchmark for role-play. Resources for "Large Language Models are Superpositions of All Characters…☆183Updated 8 months ago
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.☆83Updated last year
- ☆48Updated 11 months ago
- Unofficial implementation of AlpaGasus☆90Updated last year
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment"☆72Updated 8 months ago
- This is the official repository for the paper "EmoBench: Evaluating the Emotional Intelligence of Large Language Models"☆57Updated 11 months ago
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024)☆57Updated 9 months ago
- ☆218Updated 3 months ago
- Repo for paper "Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents"☆49Updated last year
- Generate multi-round conversation roleplay data based on self-instruct and evol-instruct.☆120Updated last month
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios☆65Updated 2 months ago
- MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation☆21Updated 10 months ago
- [ACL 2024] Code for "MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation"☆34Updated 7 months ago
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track)☆64Updated this week
- A Bilingual Role Evaluation Benchmark for Large Language Models☆37Updated last year
- ☆72Updated 4 months ago
- Reformatted Alignment☆114Updated 4 months ago
- [EMNLP 2024] Ask-before-Plan: Proactive Language Agents for Real-World Planning☆18Updated 3 months ago
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".☆68Updated last month
- ☆93Updated last year
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)☆116Updated 3 months ago
- Logiqa2.0 dataset - logical reasoning in MRC and NLI tasks☆86Updated last year
- [ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios☆52Updated 10 months ago
- The official repository of the Omni-MATH benchmark.☆71Updated last month
- Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L…☆44Updated 7 months ago
- Codebase for LLM story generation; updated version of https://github.com/yangkevin2/doc-story-generation☆76Updated last year