boson-ai / RPBench-Auto

An automated pipeline for evaluating LLMs for role-playing.

☆201

Alternatives and similar repositories for RPBench-Auto

Users who are interested in RPBench-Auto are comparing it to the libraries listed below.

modelscope / Trinity-RFT
Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (…
☆379Updated this week
HarderThenHarder / RLLoggingBoard
A visualization tool for deeper understanding and easier debugging of RLHF training.
☆261Updated 8 months ago
step-law / steplaw
☆205Updated this week
a-m-team / a-m-models
a-m-team's exploration in large language modeling
☆190Updated 5 months ago
Neph0s / CoSER
Official Code for "CoSER: Coordinating LLM-Based Persona Simulation of Established Roles"
☆135Updated 4 months ago
OFA-Sys / InsTag
InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning
☆278Updated 2 years ago
open-compass / T-Eval
[ACL2024] T-Eval: Evaluating Tool Utilization Capability of Large Language Models Step by Step
☆294Updated last year
Strivin0311 / long-llms-learning
A repository sharing the literature on long-context large language models, including methodologies and evaluation benchmarks
☆268Updated last year
InternLM / OREAL
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
☆190Updated 7 months ago
TemporaryLoRA / Temp-LoRA
☆118Updated last year
GAIR-NLP / ProX
[ICML 2025] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale
☆263Updated 3 months ago
alibaba / ChatLearn
A flexible and efficient training framework for large-scale alignment tasks
☆433Updated last week
QwenLM / AutoIF
☆312Updated last year
bytarnish / AGILE
☆161Updated 9 months ago
THUDM / LongAlign
[EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs
☆257Updated 10 months ago
SuperGPQA / SuperGPQA
☆169Updated 6 months ago
morecry / CharacterEval
☆271Updated 5 months ago
GAIR-NLP / abel
SOTA Math Open-Source LLM
☆333Updated last year
modelscope / easydistill
A toolkit for knowledge distillation of large language models
☆181Updated 2 weeks ago
LCLM-Horizon / A-Comprehensive-Survey-For-Long-Context-Language-Modeling
A Comprehensive Survey on Long Context Language Modeling
☆197Updated 3 months ago
RUCAIBox / Slow_Thinking_with_LLMs
A series of technical reports on Slow Thinking with LLMs
☆744Updated 2 months ago
wjn1996 / Awesome-LLM-Reasoning-Openai-o1-Survey
Related work and background techniques for OpenAI o1
☆223Updated 9 months ago
OpenBMB / UltraEval
[ACL 2024 Demo] Official GitHub repo for UltraEval: An open source framework for evaluating foundation models.
☆251Updated last year
nuochenpku / Awesome-Role-Play-Papers
Awesome papers for role-playing with language models
☆207Updated 11 months ago
the-seeds / LLaMA-Factory-Doc
LLaMA Factory Document
☆152Updated 2 weeks ago
OFA-Sys / Ditto
A self-alignment method for role-play. Benchmark for role-play. Resources for "Large Language Models are Superpositions of All Characters…
☆204Updated last year
OpenBMB / Eurus
☆320Updated last year
sail-sg / oat-zero
A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.
☆247Updated 6 months ago
qiancheng0 / ToolRL
☆367Updated 2 weeks ago
OFA-Sys / gsm8k-ScRel
Code and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
☆266Updated last year