MurtyShikhar / structural-grokking
Code for our ACL '23 paper titled "Grokking of Hierarchical Structure in Vanilla Transformers"
☆21 Updated last year
Alternatives and similar repositories for structural-grokking
Users who are interested in structural-grokking compare it to the libraries listed below.
- ☆33 Updated last year
- ☆34 Updated last year
- ☆45 Updated last year
- ☆52 Updated 11 months ago
- This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Ca… ☆61 Updated 2 years ago
- Repository for the code of the "PPL-MCTS: Constrained Textual Generation Through Discriminator-Guided Decoding" paper, NAACL'22 ☆65 Updated 2 years ago
- Code for Pushdown Layers from our EMNLP 2023 paper ☆28 Updated last year
- Universal Neurons in GPT2 Language Models ☆29 Updated 11 months ago
- The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le… ☆91 Updated 3 years ago
- ☆64 Updated 2 years ago
- ☆85 Updated last year
- ☆13 Updated 9 months ago
- NaturalProver: Grounded Mathematical Proof Generation with Language Models ☆37 Updated 2 years ago
- Source code and data for The Magic of IF: Investigating Causal Reasoning Abilities in Large Language Models of Code (Findings of ACL 2023… ☆29 Updated last year
- ☆21 Updated last year
- Code for the paper "Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving" ☆19 Updated last year
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643 ☆76 Updated last year
- ☆107 Updated 2 years ago
- ☆83 Updated 9 months ago
- A library for efficient patching and automatic circuit discovery. ☆64 Updated 3 weeks ago
- CausalGym: Benchmarking causal interpretability methods on linguistic tasks ☆42 Updated 5 months ago
- The data and the PyTorch implementation for the models and experiments in the paper "Language Model Decoding as Likelihood–Utility Alignm… ☆14 Updated last year
- ☆92 Updated 10 months ago
- A framework for few-shot evaluation of autoregressive language models. ☆24 Updated last year
- ☆39 Updated 2 years ago
- Self-Supervised Alignment with Mutual Information ☆18 Updated 11 months ago
- Distributional Generalization in NLP. A roadmap. ☆88 Updated 2 years ago
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs ☆54 Updated last year
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆27 Updated last year
- ☆34 Updated 4 months ago