Luckfort / CD
[COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?
☆78 Updated 3 months ago
Alternatives and similar repositories for CD:
Users who are interested in CD are comparing it to the repositories listed below.
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective ☆63 Updated 2 months ago
- This repository contains the code and data for the paper "SelfIE: Self-Interpretation of Large Language Model Embeddings" by Haozhe Chen,… ☆48 Updated 5 months ago
- Code for "A Sober Look at Progress in Language Model Reasoning" paper ☆45 Updated this week
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers ☆145 Updated 2 months ago
- Function Vectors in Large Language Models (ICLR 2024) ☆166 Updated 3 weeks ago
- ☆97 Updated 2 months ago
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆112 Updated last year
- ☆165 Updated last month
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆72 Updated 6 months ago
- [ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral) ☆77 Updated 6 months ago
- ☆99 Updated last week
- ☆37 Updated last year
- Official code for Guiding Language Model Math Reasoning with Planning Tokens ☆11 Updated last year
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ☆57 Updated last year
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning ☆70 Updated 2 months ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆119 Updated last month
- Code for "Reasoning to Learn from Latent Thoughts" ☆93 Updated last month
- ☆23 Updated last month
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆57 Updated 5 months ago
- ☆59 Updated 3 weeks ago
- AnchorAttention: Improved attention for LLMs long-context training ☆207 Updated 3 months ago
- ☆29 Updated 2 months ago
- ☆97 Updated 10 months ago
- [NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning". ☆54 Updated 2 months ago
- A brief and partial summary of RLHF algorithms. ☆128 Updated 2 months ago
- PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆35 Updated 6 months ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆109 Updated last year
- Code accompanying the paper "Massive Activations in Large Language Models" ☆160 Updated last year
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆104 Updated last year
- Interpretable Contrastive Monte Carlo Tree Search Reasoning ☆48 Updated 6 months ago