Sea-Snell / grokking
unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"
☆83 · Jul 4, 2022 · Updated 3 years ago
Alternatives and similar repositories for grokking
Users who are interested in grokking are comparing it to the libraries listed below.
- Implementation of OpenAI's 'Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets' paper. ☆42 · Sep 23, 2023 · Updated 2 years ago
- [NeurIPS 2025] Official code for "Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms" ☆23 · Oct 23, 2025 · Updated 3 months ago
- JAX Scalify: end-to-end scaled arithmetics ☆18 · Oct 30, 2024 · Updated last year
- A machine learning library capable of training various deep neural networks (RNNs, LSTMs, DBNs, etc.) on a GPU. It makes use of auto-di… ☆10 · Aug 28, 2018 · Updated 7 years ago
- Official Implementation of PatentLMM (our AAAI 2025 Paper) ☆16 · Jan 28, 2025 · Updated last year
- ☆19 · Mar 25, 2025 · Updated 10 months ago
- ARLC, a probabilistic abductive reasoner for solving Raven's progressive matrices. ☆20 · Sep 18, 2025 · Updated 4 months ago
- A browser extension providing Open Access bibliographical services ☆18 · Dec 9, 2022 · Updated 3 years ago
- Graph Transformers for Large Graphs ☆22 · Apr 26, 2024 · Updated last year
- ☆14 · Jul 22, 2021 · Updated 4 years ago
- Exploring Model Kinship for Merging Large Language Models ☆27 · Apr 16, 2025 · Updated 9 months ago
- Official Implementation for NorMuon paper ☆55 · Updated this week
- code associated with paper "Sparse Bayesian Optimization" ☆26 · Oct 31, 2023 · Updated 2 years ago
- Understanding RL vision Distill article ☆25 · Mar 3, 2023 · Updated 2 years ago
- ☆27 · Feb 1, 2023 · Updated 3 years ago
- Code for T-MARS data filtering ☆35 · Aug 23, 2023 · Updated 2 years ago
- Code for paper "Parameter Efficient Multi-task Model Fusion with Partial Linearization" ☆25 · Sep 13, 2024 · Updated last year
- Universal Neurons in GPT2 Language Models ☆30 · May 28, 2024 · Updated last year
- Kaggle AIMO2 solution with token-efficient reasoning LLM recipes ☆42 · Aug 7, 2025 · Updated 6 months ago
- ☆30 · Jan 17, 2022 · Updated 4 years ago
- MLX implementation of xLSTM model by Beck et al. (2024) ☆31 · Jun 5, 2024 · Updated last year
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions ☆72 · Mar 26, 2023 · Updated 2 years ago
- ☆33 · May 15, 2024 · Updated last year
- Official implementation of "BERTs are Generative In-Context Learners" ☆32 · Mar 14, 2025 · Updated 11 months ago
- ⚡ C̷h̷u̷c̷k̷N̷o̷r̷r̷i̷s̷ MCP server: Helping LLMs break limits. Provides enhancement prompts inspired by elder-plinius' L1B3RT4S ☆54 · Apr 11, 2025 · Updated 10 months ago
- ☆26 · Updated this week
- Simple Scalable Discrete Diffusion for text in PyTorch ☆37 · Sep 27, 2024 · Updated last year
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆132 · Apr 17, 2024 · Updated last year
- Repository of IPBench ☆19 · Jan 4, 2026 · Updated last month
- Jeroen Cottaar's work for the Kaggle Geophysical Waveform Inversion competition (2nd place) ☆11 · Aug 11, 2025 · Updated 6 months ago
- Automated Continuous Data Quality Measurement ☆12 · Nov 15, 2023 · Updated 2 years ago
- HWFLY/SX Firmware based on https://github.com/hwfly-nx/firmware with Instinct-NX loader and toshiba timeout patch ☆13 · Aug 22, 2023 · Updated 2 years ago
- A Model Context Protocol server that provides documentation access capabilities. This server enables LLMs to search and retrieve content … ☆18 · Apr 29, 2025 · Updated 9 months ago
- Code for "Just Train Twice: Improving Group Robustness without Training Group Information" ☆73 · May 18, 2024 · Updated last year
- ☆36 · Mar 12, 2025 · Updated 11 months ago
- Interpreting the latent space representations of attention head outputs for LLMs ☆36 · Aug 13, 2024 · Updated last year
- Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only, don't use it for Adam ☆86 · Jul 28, 2024 · Updated last year
- Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes ☆242 · May 12, 2023 · Updated 2 years ago
- This project aims to extract ROI like finger tip, Palmprint and Hand-geometry from a single hand image. ☆10 · Aug 24, 2023 · Updated 2 years ago