HanjieChen / Reading-List
☆42 · Updated last year
Alternatives and similar repositories for Reading-List
Users who are interested in Reading-List are comparing it to the repositories listed below.
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆57 · Updated 5 months ago
- ☆165 · Updated 10 months ago
- ☆132 · Updated last year
- ☆23 · Updated 5 months ago
- awesome SAE papers ☆27 · Updated 2 months ago
- ☆50 · Updated last year
- ☆31 · Updated 2 months ago
- A resource repository for representation engineering in large language models ☆120 · Updated 6 months ago
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity. ☆72 · Updated 2 months ago
- ☆161 · Updated 5 months ago
- DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024) ☆65 · Updated 7 months ago
- ☆14 · Updated 4 months ago
- ☆42 · Updated 5 months ago
- ☆36 · Updated 2 years ago
- LoFiT: Localized Fine-tuning on LLM Representations ☆38 · Updated 4 months ago
- [EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions ☆110 · Updated 8 months ago
- code for EMNLP 2024 paper: Neuron-Level Knowledge Attribution in Large Language Models ☆32 · Updated 6 months ago
- [ACL 2023] Knowledge Unlearning for Mitigating Privacy Risks in Language Models ☆81 · Updated 8 months ago
- ☆29 · Updated last year
- Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization ☆22 · Updated 9 months ago
- ☆10 · Updated 2 months ago
- ACL 2022: An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models. ☆136 · Updated 5 months ago
- Source code of our paper MIND, ACL 2024 Long Paper ☆40 · Updated 11 months ago
- ☆94 · Updated last year
- Repo for paper: Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge ☆13 · Updated last year
- Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023) ☆59 · Updated last year
- Code for the ACL-2022 paper "Knowledge Neurons in Pretrained Transformers" ☆168 · Updated last year
- ☆75 · Updated 4 months ago
- A curated list of resources for activation engineering ☆74 · Updated last week
- [NeurIPS 2024] How do Large Language Models Handle Multilingualism? ☆34 · Updated 6 months ago