Code for our ACL '23 paper titled "Grokking of Hierarchical Structure in Vanilla Transformers"
☆24 · Oct 8, 2023 · Updated 2 years ago
Alternatives and similar repositories for structural-grokking
Users who are interested in structural-grokking are comparing it to the repositories listed below.
- Code for Pushdown Layers from our EMNLP 2023 paper ☆29 · Dec 3, 2023 · Updated 2 years ago
- Repository for the code and dataset for the paper: "Have LLMs Advanced enough? Towards Harder Problem Solving Benchmarks For Large Langu… ☆39 · Dec 18, 2023 · Updated 2 years ago
- A Concept-Centric Framework for Intelligent Agents ☆23 · Oct 1, 2025 · Updated 6 months ago
- Phys4DGen: A Physics-Driven Framework for Controllable and Efficient 4D Content Generation from a Single Image ☆12 · May 10, 2025 · Updated 11 months ago
- Tasks for describing differences between text distributions. ☆17 · Aug 9, 2024 · Updated last year
- Code for our EMNLP '22 paper "Fixing Model Bugs with Natural Language Patches" ☆19 · Dec 7, 2022 · Updated 3 years ago
- Code for EMNLP 2022 Paper: On the Calibration of Massively Multilingual Language Models ☆15 · Jun 12, 2023 · Updated 2 years ago
- MAGIC: Microlensing Analysis Guided by Intelligent Computation. A PyTorch framework for automatic analysis of realistic microlensing ligh… ☆13 · May 30, 2024 · Updated last year
- Code for ICCV2021 paper: Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images ☆15 · Jan 24, 2023 · Updated 3 years ago
- ☆11 · Oct 3, 2022 · Updated 3 years ago
- JEEBench, EMNLP 2023 ☆44 · Dec 18, 2023 · Updated 2 years ago
- ☆17 · Oct 22, 2024 · Updated last year
- Library for the training and evaluation of object-centric models (ICML 2022) ☆71 · Apr 30, 2023 · Updated 2 years ago
- ☆18 · Apr 13, 2025 · Updated last year
- Implementation of OpenAI's 'Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets' paper. ☆43 · Sep 23, 2023 · Updated 2 years ago
- [TMLR 2026] GIOROM, sampling based model-order reduction for Lagrangian systems ☆20 · Mar 12, 2026 · Updated last month
- Analyze the dynamic stability of SGD ☆13 · Nov 25, 2018 · Updated 7 years ago
- A dataset of 80 million constraint-preserving transformations of CAD sketches ☆14 · Nov 22, 2024 · Updated last year
- ☆23 · Nov 8, 2023 · Updated 2 years ago
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models" ☆28 · Oct 3, 2021 · Updated 4 years ago
- Generate ics file given a set of courses and slots ☆12 · Sep 16, 2024 · Updated last year
- Official Implementation of ACL2023: Don't Parse, Choose Spans! Continuous and Discontinuous Constituency Parsing via Autoregressive Span … ☆14 · Aug 25, 2023 · Updated 2 years ago
- Learning Algebraic Representation for Systematic Generalization in Abstract Reasoning ☆11 · Jul 20, 2022 · Updated 3 years ago
- Code for "SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields" (ECCV 2024) ☆12 · Oct 30, 2024 · Updated last year
- This is the official code repository for the paper "Language Agents Meet Causality -- Bridging LLMs and Causal World Models" ☆29 · May 6, 2025 · Updated 11 months ago
- Multimodal extreme classification ☆21 · May 1, 2024 · Updated last year
- ☆10 · Oct 20, 2023 · Updated 2 years ago
- Code for "Does syntax need to grow on trees? Sources of inductive bias in sequence to sequence networks" ☆24 · Jan 14, 2020 · Updated 6 years ago
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization' ☆236 · Jul 19, 2025 · Updated 8 months ago
- Official implementation of MARIO: Model Agnostic Recipe for Improving OOD Generalization of Graph Contrastive Learning ☆19 · Jan 27, 2024 · Updated 2 years ago
- ☆48 · Oct 2, 2025 · Updated 6 months ago
- Code for "Multi-scale Abstract Reasoning" paper ☆12 · Oct 17, 2022 · Updated 3 years ago
- Menagerie of video models trained on various video datasets ☆10 · Oct 13, 2024 · Updated last year
- ☆14 · May 3, 2022 · Updated 3 years ago
- (NeurIPS 2024) What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights ☆27 · Oct 28, 2024 · Updated last year
- Code for paper "Extract, Denoise and Enforce: Evaluating and Improving Concept Preservation for Text-to-Text Generation" EMNLP 2021 and "… ☆18 · Feb 15, 2022 · Updated 4 years ago
- [NeurIPS 2023] Learning Energy-Based Prior Model with Diffusion-Amortized MCMC ☆13 · Mar 1, 2026 · Updated last month
- Delete Unwanted Bibliography fields from bibtex (.bib) files ☆24 · Dec 24, 2018 · Updated 7 years ago
- The official implementation of “MonoArt: Progressive Structural Reasoning for Monocular Articulated 3D Reconstruction” ☆55 · Mar 20, 2026 · Updated 3 weeks ago