MurtyShikhar / structural-grokking
Code for our ACL '23 paper titled "Grokking of Hierarchical Structure in Vanilla Transformers"
☆21 Updated last year
Related projects
Alternatives and complementary repositories for structural-grokking
- ☆50 Updated 6 months ago
- ☆44 Updated last year
- ☆28 Updated last year
- Simple and efficient pytorch-native transformer training and inference (batched) ☆61 Updated 7 months ago
- Synthetic question-answering dataset to formally analyze the chain-of-thought output of large language models on a reasoning task. ☆118 Updated last month
- A library for efficient patching and automatic circuit discovery. ☆31 Updated last month
- ☆75 Updated last month
- ☆29 Updated 7 months ago
- ☆73 Updated 4 months ago
- A framework for few-shot evaluation of autoregressive language models. ☆23 Updated 11 months ago
- The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le… ☆85 Updated 3 years ago
- ☆128 Updated 10 months ago
- ☆103 Updated 4 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆97 Updated 2 months ago
- This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Ca… ☆55 Updated last year
- NaturalProver: Grounded Mathematical Proof Generation with Language Models ☆34 Updated last year
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆80 Updated last week
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆84 Updated 8 months ago
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643 ☆69 Updated last year
- ☆40 Updated 2 years ago
- ☆89 Updated 11 months ago
- ☆25 Updated 4 months ago
- ☆20 Updated last year
- ☆55 Updated last month
- Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici… ☆95 Updated last year
- ☆24 Updated 2 months ago
- Universal Neurons in GPT2 Language Models ☆27 Updated 5 months ago
- Code for the paper "Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving" ☆17 Updated last year
- ☆33 Updated 8 months ago
- ☆21 Updated 2 months ago