Luckfort / CD
"Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?"
☆58 · Updated last month
Related projects
Alternatives and complementary repositories for CD
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆103 · Updated 6 months ago
- [SafeGenAi @ NeurIPS 2024] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates ☆60 · Updated 3 weeks ago
- Implementation of PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆26 · Updated 2 weeks ago
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers ☆75 · Updated last month
- WMDP is an LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning m… ☆82 · Updated 6 months ago
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective ☆40 · Updated 3 weeks ago
- Official code for the paper "Towards Efficient and Effective Unlearning of Large Language Models for Recommendation" (Frontiers of Computer S… ☆34 · Updated 4 months ago
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs ☆48 · Updated 7 months ago
- Function Vectors in Large Language Models (ICLR 2024) ☆119 · Updated last month
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning ☆84 · Updated 5 months ago
- Codebase for Instruction Following without Instruction Tuning ☆32 · Updated last month
- ☆33 · Updated last year
- [ATTRIB @ NeurIPS 2024] When Attention Sink Emerges in Language Models: An Empirical View ☆29 · Updated last month
- The official implementation of Self-Exploring Language Models (SELM) ☆55 · Updated 5 months ago
- ☆90 · Updated 4 months ago
- [NeurIPS 2024] GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations ☆50 · Updated 2 months ago
- Code and example data for the paper "Rule Based Rewards for Language Model Safety" ☆158 · Updated 4 months ago
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs ☆76 · Updated 9 months ago
- Code for reproducing the paper "Not All Language Model Features Are Linear" ☆61 · Updated last week
- ☆46 · Updated 2 weeks ago
- Code and data for "Long-context LLMs Struggle with Long In-context Learning" ☆91 · Updated 4 months ago
- A novel approach to improving the safety of large language models, enabling them to transition effectively from an unsafe to a safe state ☆52 · Updated last month
- Code for the EMNLP 2024 paper "Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps" ☆109 · Updated 3 months ago
- ConceptVectors benchmark and code for the paper "Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces" ☆29 · Updated last month
- [ACL 2024] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models ☆68 · Updated 5 months ago
- A dataset of LLM-generated chain-of-thought steps annotated with mistake location ☆73 · Updated 3 months ago
- [NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning" ☆24 · Updated 3 weeks ago
- Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses (NeurIPS 2024) ☆48 · Updated 3 months ago
- [ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy" ☆64 · Updated 5 months ago
- ☆153 · Updated 11 months ago