rhubarbwu / linguistic-collapse
Codebase for Linguistic Collapse: Neural Collapse in (Large) Language Models [NeurIPS 2024] [arXiv:2405.17767]
☆16 Updated 6 months ago
Alternatives and similar repositories for linguistic-collapse
Users who are interested in linguistic-collapse are comparing it to the repositories listed below
- ☆57 Updated 2 years ago
- ☆98 Updated 2 years ago
- ☆78 Updated 3 years ago
- ☆29 Updated last year
- ☆67 Updated 2 years ago
- ☆108 Updated 8 months ago
- [ACL 2023] Knowledge Unlearning for Mitigating Privacy Risks in Language Models ☆84 Updated last year
- ☆241 Updated last year
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity. ☆83 Updated 7 months ago
- [ICLR 2025] General-purpose activation steering library ☆115 Updated last month
- ☆46 Updated last year
- ☆38 Updated 2 years ago
- AI Logging for Interpretability and Explainability🔬 ☆130 Updated last year
- ☆236 Updated last year
- Function Vectors in Large Language Models (ICLR 2024) ☆181 Updated 6 months ago
- ☆57 Updated 3 months ago
- DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024) ☆76 Updated last year
- The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le… ☆97 Updated 4 years ago
- [NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors ☆81 Updated 10 months ago
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models". ☆105 Updated 2 years ago
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆136 Updated 4 months ago
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643 ☆78 Updated 2 years ago
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆66 Updated 11 months ago
- NeuroSurgeon is a package that enables researchers to uncover and manipulate subnetworks within models in Huggingface Transformers ☆41 Updated 8 months ago
- ☆63 Updated 7 months ago
- Sparse probing paper full code. ☆63 Updated last year
- `dattri` is a PyTorch library for developing, benchmarking, and deploying efficient data attribution algorithms. ☆92 Updated 2 weeks ago
- ☆51 Updated last year
- Evaluate interpretability methods on localizing and disentangling concepts in LLMs. ☆56 Updated this week
- EMNLP 2022: "MABEL: Attenuating Gender Bias using Textual Entailment Data" https://arxiv.org/abs/2210.14975 ☆38 Updated last year