Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'
☆234Jul 19, 2025Updated 8 months ago
Alternatives and similar repositories for GrokkedTransformer
Users that are interested in GrokkedTransformer are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [EMNLP 2024] Official implementation of "Hierarchical Deconstruction of LLM Reasoning: A Graph-Based Framework for Analyzing Knowledge Ut…☆23Dec 4, 2024Updated last year
- ☆19Mar 25, 2025Updated last year
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆54Feb 23, 2024Updated 2 years ago
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts"☆80Apr 12, 2024Updated last year
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning"☆13Jun 22, 2025Updated 9 months ago
- MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following☆16Oct 31, 2024Updated last year
- [EMNLP 2024 Tutorial] Language Agents: Foundations, Prospects, and Risks☆10Nov 27, 2024Updated last year
- Exploring the Limitations of Large Language Models on Multi-Hop Queries☆33Mar 2, 2025Updated last year
- Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044☆35Oct 3, 2024Updated last year
- The code for creating the iGSM datasets in papers "Physics of Language Models Part 2.1, Grade-School Math and the Hidden Reasoning Proces…☆85Jan 12, 2025Updated last year
- Pretraining and inference code for a large-scale depth-recurrent language model☆865Dec 29, 2025Updated 2 months ago
- Code for reproducing our paper "Not All Language Model Features Are Linear"☆84Nov 27, 2024Updated last year
- ☆19May 3, 2025Updated 10 months ago
- Code for my NeurIPS 2024 ATTRIB paper titled "Attribution Patching Outperforms Automated Circuit Discovery"☆47May 31, 2024Updated last year
- ☆58Nov 19, 2024Updated last year
- Providing the answer to "How to do patching on all available SAEs on GPT-2?". It is an official repository of the implementation of the p…☆13Jan 26, 2025Updated last year
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns …☆16Jun 11, 2025Updated 9 months ago
- ☆115Jul 23, 2025Updated 8 months ago
- Code for our ACL '23 paper titled "Grokking of Hierarchical Structure in Vanilla Transformers"☆24Oct 8, 2023Updated 2 years ago
- [EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality☆19Mar 4, 2025Updated last year
- ☆33Jan 7, 2025Updated last year
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]☆149Oct 27, 2024Updated last year
- Codes and Data for ACL 2024 Paper "Faithful Logical Reasoning via Symbolic Chain-of-Thought".☆201Jan 29, 2026Updated last month
- [NeurIPS 2025] What Makes a Reward Model a Good Teacher? An Optimization Perspective☆43Sep 18, 2025Updated 6 months ago
- ☆1,033Dec 17, 2024Updated last year
- Official code for paper Understanding the Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation☆21Feb 29, 2024Updated 2 years ago
- A library for efficient patching and automatic circuit discovery.☆93Dec 31, 2025Updated 2 months ago
- ☆15Feb 21, 2024Updated 2 years ago
- ☆91Aug 18, 2024Updated last year
- ☆131Oct 1, 2024Updated last year
- Code for reproducing the ACL'23 paper: Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments☆78May 17, 2025Updated 10 months ago
- Training Large Language Model to Reason in a Continuous Latent Space☆1,536Aug 12, 2025Updated 7 months ago
- The source code for running LLMs on the AAAR-1.0 benchmark.☆18Apr 5, 2025Updated 11 months ago
- Engine for collecting, uploading, and downloading model activations☆27Apr 2, 2025Updated 11 months ago
- ☆113Feb 11, 2025Updated last year
- [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs☆98Nov 17, 2024Updated last year
- Code for Quiet-STaR☆741Aug 21, 2024Updated last year
- MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer (EMNLP 2025)☆11Apr 18, 2025Updated 11 months ago
- GoldFinch and other hybrid transformer components☆45Jul 20, 2024Updated last year