rhubarbwu / linguistic-collapse

Codebase for Linguistic Collapse: Neural Collapse in (Large) Language Models [NeurIPS 2024] [arXiv:2405.17767]

☆13

Alternatives and similar repositories for linguistic-collapse

Users who are interested in linguistic-collapse are comparing it to the libraries listed below.

mmatena / model_merging
☆69 Updated 3 years ago
KihoPark / linear_rep_geometry
☆95 Updated 4 months ago
gortizji / tangent_task_arithmetic
Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models".
☆102 Updated 2 years ago
ykwon0407 / DataInf
DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024)
☆70 Updated 8 months ago
r-three / mats
☆30 Updated 11 months ago
roeehendel / icl_task_vectors
☆95 Updated last year
ajyl / dpo_toxic
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity.
☆72 Updated 3 months ago
IBM / activation-steering
General-purpose activation steering library
☆78 Updated last month
wesg52 / sparse-probing-paper
Sparse probing paper full code.
☆58 Updated last year
VectorInstitute / vectorlm
LLM finetuning in resource-constrained environments.
☆47 Updated last year
locuslab / acr-memorization
☆34 Updated 6 months ago
javiferran / sae_entities
☆44 Updated 3 months ago
logix-project / logix
AI Logging for Interpretability and Explainability🔬
☆123 Updated last year
yuzhaouoe / SAE-based-representation-engineering
[NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
☆60 Updated 6 months ago
MadryLab / journey-TRAK
Code for the paper "The Journey, Not the Destination: How Data Guides Diffusion Models"
☆22 Updated last year
Thartvigsen / GRACE
[NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors
☆77 Updated 6 months ago
pomonam / kronfluence
Influence Functions with (Eigenvalue-corrected) Kronecker-Factored Approximate Curvature
☆156 Updated this week
TRAIS-Lab / dattri
`dattri` is a PyTorch library for developing, benchmarking, and deploying efficient data attribution algorithms.
☆77 Updated 2 weeks ago
montemac / activation_additions
Algebraic value editing in pretrained language models
☆65 Updated last year
explanare / ravel
Evaluate interpretability methods on localizing and disentangling concepts in LLMs.
☆47 Updated 8 months ago
UFO-101 / auto-circuit
A library for efficient patching and automatic circuit discovery.
☆67 Updated 2 months ago
DeqingFu / transformers-icl-second-order
Official repository for our paper, Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Mode…
☆16 Updated 7 months ago
google-research / jax-influence
☆60 Updated 3 years ago
UKPLab / iclr2024-model-merging
This is the repository for "Model Merging by Uncertainty-Based Gradient Matching", ICLR 2024.
☆27 Updated last year
yfqiu-nlp / sea-llm
Code for the paper "Spectral Editing of Activations for Large Language Model Alignments"
☆24 Updated 6 months ago
rishub-tamirisa / tamper-resistance
[ICLR 2025] Official Repository for "Tamper-Resistant Safeguards for Open-Weight LLMs"
☆58 Updated 2 weeks ago
Nix07 / finetuning
This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity…
☆27 Updated last year
tml-epfl / long-is-more-for-alignment
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning [ICML 2024]
☆17 Updated last year
mega002 / ff-layers
The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le…
☆93 Updated 3 years ago
fc2869 / lo-fit
LoFiT: Localized Fine-tuning on LLM Representations
☆39 Updated 5 months ago