myracheng / lm_caricature
Code and data associated with CoMPosT: Characterizing and Evaluating Caricature in LLM Simulations
☆10Updated last year
Alternatives and similar repositories for lm_caricature
Users who are interested in lm_caricature are comparing it to the repositories listed below.
- This repository contains data, code and models for contextual noncompliance.☆22Updated 10 months ago
- [ICLR 2025] Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization☆25Updated 3 months ago
- ☆22Updated 7 months ago
- Teaching Models to Express Their Uncertainty in Words☆39Updated 2 years ago
- This is the official repo for Towards Uncertainty-Aware Language Agent.☆24Updated 9 months ago
- ☆36Updated 2 years ago
- ☆50Updated last year
- ☆29Updated last year
- ☆78Updated 2 years ago
- Findings of ACL'2023: Optimizing Test-Time Query Representations for Dense Retrieval☆30Updated last year
- Data Valuation on In-Context Examples (ACL23)☆23Updated 4 months ago
- Supporting code for ReCEval paper☆28Updated 8 months ago
- ☆54Updated last year
- code for EMNLP 2024 paper: How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for M…☆11Updated 6 months ago
- A framework to train language models to learn invariant representations.☆14Updated 3 years ago
- ☆9Updated last year
- ☆34Updated 3 years ago
- Restore safety in fine-tuned language models through task arithmetic☆28Updated last year
- ☆15Updated last year
- ☆62Updated 2 years ago
- Align your LM to express calibrated verbal statements of confidence in its long-form generations.☆24Updated 11 months ago
- ☆29Updated last year
- ☆50Updated last year
- A codebase for ACL 2023 paper: Mitigating Label Biases for In-context Learning☆10Updated last year
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering☆57Updated 5 months ago
- ☆26Updated 3 months ago
- Code for preprint: Summarizing Differences between Text Distributions with Natural Language☆42Updated 2 years ago
- ☆44Updated 8 months ago
- ☆26Updated last year
- Code for paper "Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language?"☆21Updated 4 years ago