BunsenFeng / modular_pluralism
Modular Pluralism @ EMNLP 2024
☆18 · Updated 9 months ago
Alternatives and similar repositories for modular_pluralism
Users interested in modular_pluralism are comparing it to the repositories listed below.
- A resource repository for representation engineering in large language models ☆127 · Updated 8 months ago
- ☆171 · Updated last year
- ☆51 · Updated 2 years ago
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity. ☆74 · Updated 4 months ago
- This repo contains code for our NeurIPS 2023 spotlight paper: Evaluating and Inducing Personality in Pre-trained Language Models ☆52 · Updated last year
- ☆95 · Updated last year
- ☆110 · Updated last year
- LLM experiments done during SERI MATS - focusing on activation steering / interpreting activation spaces ☆95 · Updated last year
- ☆18 · Updated last year
- The Prism Alignment Project ☆79 · Updated last year
- ☆219 · Updated last year
- General-purpose activation steering library ☆84 · Updated 2 months ago
- Steering Llama 2 with Contrastive Activation Addition ☆164 · Updated last year
- ☆140 · Updated last year
- ☆43 · Updated last year
- ☆29 · Updated last year
- ☆95 · Updated last year
- Code for the ICML 2024 paper "Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment" ☆73 · Updated last month
- Official repository for the paper "High-Dimension Human Value Representation in Large Language Models" (NAACL'25 Main) ☆23 · Updated last year
- ☆25 · Updated last month
- ☆25 · Updated 8 months ago
- Function Vectors in Large Language Models (ICLR 2024) ☆172 · Updated 3 months ago
- Sparse probing paper full code. ☆58 · Updated last year
- datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆82 · Updated last year
- ☆163 · Updated 7 months ago
- The Paper List on Data Contamination for Large Language Models Evaluation. ☆95 · Updated 3 months ago
- Repository for the Bias Benchmark for QA dataset. ☆123 · Updated last year
- Official code for "Decoding-Time Language Model Alignment with Multiple Objectives". ☆25 · Updated 8 months ago
- For OpenMOSS Mechanistic Interpretability Team's Sparse Autoencoder (SAE) research. ☆136 · Updated this week
- Röttger et al. (NAACL 2024): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models" ☆101 · Updated 4 months ago