PERSONA-bench / PERSONALinks
LLM Benchmark
☆37 · Updated 6 months ago
Alternatives and similar repositories for PERSONA
Users interested in PERSONA are comparing it to the repositories listed below.
- ☆30 · Updated last year
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ☆63 · Updated last year
- Toolkit for evaluating the trustworthiness of generative foundation models. ☆123 · Updated 3 months ago
- A curated list of resources for activation engineering ☆119 · Updated 2 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆145 · Updated 5 months ago
- Can Knowledge Editing Really Correct Hallucinations? (ICLR 2025) ☆28 · Updated 4 months ago
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge. ☆86 · Updated 10 months ago
- [ACL 2024] Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models ☆59 · Updated last year
- ☆140 · Updated 3 months ago
- This repo contains the source code for reproducing the experimental results in the semantic density paper (NeurIPS 2024) ☆17 · Updated 2 months ago
- [ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation" ☆90 · Updated last year
- ☆189 · Updated 7 months ago
- A novel approach to improving the safety of large language models, enabling them to transition effectively from an unsafe to a safe state. ☆73 · Updated 6 months ago
- [ACL' 25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models. ☆85 · Updated 10 months ago
- Code for our paper "Defending ChatGPT against Jailbreak Attack via Self-Reminder" in NMI. ☆53 · Updated 2 years ago
- ☆135 · Updated 9 months ago
- [ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models" ☆65 · Updated last year
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆46 · Updated last year
- [ICML 2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast ☆117 · Updated last year
- V1: Toward Multimodal Reasoning by Designing Auxiliary Task ☆36 · Updated 8 months ago
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space. ☆235 · Updated last week
- ☆57 · Updated 5 months ago
- ☆51 · Updated 10 months ago
- This paper list focuses on the theoretical and empirical analysis of language models, especially large language models (LLMs). The papers… ☆96 · Updated last year
- RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models. NeurIPS 2024 ☆86 · Updated last year
- [2025-TMLR] A Survey on the Honesty of Large Language Models ☆63 · Updated last year
- [ICML 2025] Official code for "Reinforced Lifelong Editing for Language Models" ☆18 · Updated 9 months ago
- [ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style ☆72 · Updated 5 months ago
- [ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety" ☆30 · Updated 5 months ago
- The repository of the paper "REEF: Representation Encoding Fingerprints for Large Language Models," which aims to protect the IP of open-source… ☆70 · Updated 11 months ago