A Bilingual Role Evaluation Benchmark for Large Language Models
☆43Jan 9, 2024Updated 2 years ago
Alternatives and similar repositories for RoleEval
Users that are interested in RoleEval are comparing it to the libraries listed below
- SuperCLUE-Role: a native Chinese role-playing evaluation benchmark☆36Apr 3, 2024Updated last year
- A self-alignment method for role-play. Benchmark for role-play. Resources for "Large Language Models are Superpositions of All Characters…☆211May 28, 2024Updated last year
- RoleInteract: Evaluating the Social Interaction of Role-Playing Agents☆67Oct 12, 2024Updated last year
- RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models☆520Oct 11, 2024Updated last year
- ☆29Aug 9, 2023Updated 2 years ago
- A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"…☆81Sep 28, 2023Updated 2 years ago
- Repository for the CODAH dataset☆22Oct 29, 2022Updated 3 years ago
- Official repository for the Findings of ACL 2023 paper "AugESC: Dialogue Augmentation with Large Language Models for Emotional Support Co…☆20May 16, 2023Updated 2 years ago
- Awesome papers for role-playing with language models☆218Nov 3, 2024Updated last year
- Code and data for the paper: On the Reliability of Psychological Scales on Large Language Models☆30Dec 15, 2025Updated 2 months ago
- Code and datasets for "Character-LLM: A Trainable Agent for Role-Playing"☆610Oct 29, 2024Updated last year
- [EMNLP'24] CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models☆490Oct 2, 2025Updated 5 months ago
- This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models.☆554Feb 12, 2024Updated 2 years ago
- Unleashing the Power of Cognitive Dynamics on Large Language Models☆63Sep 24, 2024Updated last year
- Research on evaluating and aligning the values of Chinese large language models☆554Jul 20, 2023Updated 2 years ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔☆35Apr 17, 2025Updated 10 months ago
- Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner☆30Jun 27, 2024Updated last year
- Based on the Evol-character framework and OpenAI API, enabling fine-grained role-playing data generation 🎭🧩.☆31Feb 1, 2024Updated 2 years ago
- Code and Data for EMNLP 2024 Paper "Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent"☆136Jul 23, 2025Updated 7 months ago
- [NIPS2023] RRHF & Wombat☆809Sep 22, 2023Updated 2 years ago
- This is the implementation of LeCo☆31Jan 20, 2025Updated last year
- Generate multi-round conversation roleplay data based on self-instruct and evol-instruct.☆137Jan 9, 2025Updated last year
- ☆12Sep 25, 2023Updated 2 years ago
- 🧠 A sample app to integrate react-native and open ai☆11Jan 1, 2023Updated 3 years ago
- Go SDK for the Bare Metal Cloud API☆14Dec 20, 2025Updated 2 months ago
- Chinese large language model evaluation, third round☆35Dec 30, 2025Updated 2 months ago
- Official code for the paper: InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews (previo…☆91May 27, 2025Updated 9 months ago
- [ICML'2024] Can AI Assistants Know What They Don't Know?☆85Feb 5, 2024Updated 2 years ago
- A Dataset for Multi-Turn Dialogue Reasoning☆332Oct 7, 2020Updated 5 years ago
- Dataset and code for “Going on a vacation” takes longer than “Going for a walk”: A Study of Temporal Commonsense Understanding, EMNLP 201…☆39Dec 16, 2020Updated 5 years ago
- Generative Judge for Evaluating Alignment☆250Jan 18, 2024Updated 2 years ago
- ☆12Dec 26, 2023Updated 2 years ago
- A Paddle reproduction of the paper ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information (ACL 2021)☆10Nov 15, 2021Updated 4 years ago
- ☆13Nov 5, 2024Updated last year
- A Swedish Natural Language Understanding Benchmark☆11Dec 12, 2025Updated 2 months ago
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference …☆14Dec 12, 2024Updated last year
- CVPR 2023: PAniC-3D, Vtubers dataset downloader☆13Apr 22, 2023Updated 2 years ago
- [CVPR2024] Learning from Synthetic Human Group Activities☆14Feb 24, 2025Updated last year
- ☆12Jan 11, 2026Updated last month