Code for the NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'
☆234Jul 19, 2025Updated 7 months ago
Alternatives and similar repositories for GrokkedTransformer
Users interested in GrokkedTransformer are comparing it to the repositories listed below
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning"☆13Jun 22, 2025Updated 8 months ago
- ☆19Mar 25, 2025Updated 11 months ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆54Feb 23, 2024Updated 2 years ago
- Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044☆35Oct 3, 2024Updated last year
- Official repository for the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients"☆577Jun 28, 2024Updated last year
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts"☆80Apr 12, 2024Updated last year
- MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following☆16Oct 31, 2024Updated last year
- Pretraining and inference code for a large-scale depth-recurrent language model☆864Dec 29, 2025Updated 2 months ago
- [EMNLP 2024] Official implementation of "Hierarchical Deconstruction of LLM Reasoning: A Graph-Based Framework for Analyzing Knowledge Ut…☆23Dec 4, 2024Updated last year
- Code for our ACL '23 paper titled "Grokking of Hierarchical Structure in Vanilla Transformers"☆24Oct 8, 2023Updated 2 years ago
- Code for reproducing our paper "Not All Language Model Features Are Linear"☆84Nov 27, 2024Updated last year
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns …☆16Jun 11, 2025Updated 8 months ago
- ☆58Nov 19, 2024Updated last year
- ☆1,033Dec 17, 2024Updated last year
- [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs☆94Nov 17, 2024Updated last year
- ☆130Oct 1, 2024Updated last year
- Exploring the Limitations of Large Language Models on Multi-Hop Queries☆32Mar 2, 2025Updated last year
- [EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality☆19Mar 4, 2025Updated last year
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]☆149Oct 27, 2024Updated last year
- ☆15Feb 21, 2024Updated 2 years ago
- ☆111Jul 23, 2025Updated 7 months ago
- [EMNLP 2024 Tutorial] Language Agents: Foundations, Prospects, and Risks☆10Nov 27, 2024Updated last year
- MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer (EMNLP 2025)☆11Apr 18, 2025Updated 10 months ago
- Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Mo…☆15Nov 11, 2024Updated last year
- GoldFinch and other hybrid transformer components☆45Jul 20, 2024Updated last year
- ☆91Aug 18, 2024Updated last year
- ☆33Jan 7, 2025Updated last year
- ☆19Mar 31, 2024Updated last year
- ☆21Oct 22, 2025Updated 4 months ago
- [NeurIPS 2025] What Makes a Reward Model a Good Teacher? An Optimization Perspective☆42Sep 18, 2025Updated 5 months ago
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling☆952Nov 16, 2025Updated 3 months ago
- Codes and Data for ACL 2024 Paper "Faithful Logical Reasoning via Symbolic Chain-of-Thought".☆201Jan 29, 2026Updated last month
- Code for Quiet-STaR☆741Aug 21, 2024Updated last year
- [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers☆75Jun 23, 2025Updated 8 months ago
- A library for efficient patching and automatic circuit discovery.☆90Dec 31, 2025Updated 2 months ago
- Repository for the paper Stream of Search: Learning to Search in Language☆154Feb 3, 2025Updated last year
- 😊 TPTT: Transforming Pretrained Transformers into Titans☆59Nov 24, 2025Updated 3 months ago
- [EMNLP 2024 Findings] ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs☆29May 22, 2025Updated 9 months ago
- ☆23Jul 5, 2024Updated last year