trestad / Factual-Recall-Mechanism
Code for the paper "Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models".
☆13 · Updated 9 months ago
Alternatives and similar repositories for Factual-Recall-Mechanism:
Users who are interested in Factual-Recall-Mechanism are comparing it to the repositories listed below.
- Code for paper 'Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse' ☆12 · Updated 5 months ago
- Code for "Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective" ☆18 · Updated last year
- A Survey on the Honesty of Large Language Models ☆51 · Updated last month
- [ATTRIB @ NeurIPS 2024 Oral] When Attention Sink Emerges in Language Models: An Empirical View ☆43 · Updated 3 months ago
- [NeurIPS 2023] Github repository for "Composing Parameter-Efficient Modules with Arithmetic Operations" ☆58 · Updated last year
- code for EMNLP 2024 paper: Neuron-Level Knowledge Attribution in Large Language Models ☆26 · Updated 2 months ago
- The official implementation of "ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization … ☆14 · Updated 11 months ago
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆30 · Updated 6 months ago
- Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024 ☆21 · Updated 6 months ago
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ☆50 · Updated 9 months ago
- Code for Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities (NeurIPS'24) ☆14 · Updated last month
- Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep ☆63 · Updated 6 months ago
- Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation" ☆16 · Updated last month
- [ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy" ☆70 · Updated 7 months ago
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆139 · Updated this week
- [AAAI 2024] MELO: Enhancing Model Editing with Neuron-indexed Dynamic LoRA ☆23 · Updated 9 months ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆102 · Updated 9 months ago
- This is my attempt to create Self-Correcting-LLM based on the paper Training Language Models to Self-Correct via Reinforcement Learning by g… ☆29 · Updated 3 weeks ago
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024] ☆39 · Updated 2 months ago
- Official Code for Paper: Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications ☆66 · Updated 3 months ago
- Code accompanying the paper "Noise Contrastive Alignment of Language Models with Explicit Rewards" (NeurIPS 2024) ☆43 · Updated 2 months ago
- Evaluating the Ripple Effects of Knowledge Editing in Language Models ☆53 · Updated 9 months ago