unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"
☆90Jul 4, 2022Updated 3 years ago
Alternatives and similar repositories for grokking
Users who are interested in grokking are comparing it to the libraries listed below.
- Demonstration of the grokking phenomenon in machine learning in a simple case☆66Feb 8, 2025Updated last year
- Omnigrok: Grokking Beyond Algorithmic Data☆63Feb 24, 2023Updated 3 years ago
- A machine learning library capable of training various deep neural networks (RNNs, LSTMs, DBNs, etc.) on a GPU. It makes use of auto-di…☆10Aug 28, 2018Updated 7 years ago
- Code for the paper "Function-Space Learning Rates"☆25Jun 3, 2025Updated 10 months ago
- ☆27Feb 1, 2023Updated 3 years ago
- In-silico design pipeline for evaluating protein structure diffusion models.☆30Jun 25, 2024Updated last year
- [NeurIPS 2025] Official code for "Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms"☆29Oct 23, 2025Updated 5 months ago
- Tabula Rasa Tic-Tac-Toe☆10Jan 3, 2019Updated 7 years ago
- Code for the AISTATS 2024 Paper "From Data Imputation to Data Cleaning - Automated Cleaning of Tabular Data Improves Downstream Predictiv…☆24Feb 14, 2024Updated 2 years ago
- Contribute to SOTAPapers.com — the most comprehensive research discovery platform. Submit new papers, request features, report issues, an…☆27Aug 14, 2025Updated 8 months ago
- JAX Scalify: end-to-end scaled arithmetics☆18Oct 30, 2024Updated last year
- Understanding RL vision Distill article☆25Mar 3, 2023Updated 3 years ago
- ☆14Apr 9, 2025Updated last year
- Explore and Control with Adversarial Surprise☆10Jul 20, 2021Updated 4 years ago
- This is a submission example for CelebA-Spoof Challenge participants.☆10Sep 8, 2020Updated 5 years ago
- Tools for studying developmental interpretability in neural networks.☆129Dec 30, 2025Updated 3 months ago
- This repo is built to facilitate the training and analysis of autoregressive transformers on maze-solving tasks.☆35Oct 28, 2025Updated 5 months ago
- Piece-wise Linear curves converted to point-clouds, analysed with Persistent Homology, represented as HyperGraphs.☆16Apr 3, 2025Updated last year
- ☆36Feb 16, 2025Updated last year
- Code for paper "Parameter Efficient Multi-task Model Fusion with Partial Linearization"☆25Sep 13, 2024Updated last year
- A Framework for Machine Learning on Encrypted Data☆12Feb 10, 2022Updated 4 years ago
- Universal Neurons in GPT2 Language Models☆30May 28, 2024Updated last year
- Exploring Model Kinship for Merging Large Language Models☆28Apr 16, 2025Updated last year
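- The first open-source project of the 源心 community: implementing TRIZ theory in software. We hope this open-source project helps more people and organizations solve problems creatively☆16Apr 18, 2016Updated 10 years ago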
- Utilities for the HuggingFace transformers library☆75Jan 21, 2023Updated 3 years ago
- A formalisation of Cartesian Frames, a perspective on embedded agency, in the HOL theorem prover.☆20Dec 20, 2021Updated 4 years ago
- This project aims to extract ROI like finger tip, Palmprint and Hand-geometry from a single hand image.☆10Aug 24, 2023Updated 2 years ago
- Minimum Description Length Recurrent Neural Networks☆19Jun 9, 2023Updated 2 years ago
- Prioritized Sequence Experience Replay☆10Aug 16, 2021Updated 4 years ago
- ☆15Jun 10, 2022Updated 3 years ago
- Code for "Just Train Twice: Improving Group Robustness without Training Group Information"☆72May 18, 2024Updated last year
- ☆15Apr 1, 2020Updated 6 years ago
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training☆132Apr 17, 2024Updated 2 years ago
- A collection of lab materials from Digitalent Scholarship Python Essentials 2019☆10Mar 27, 2022Updated 4 years ago
- Official repo of paper LM2☆47Feb 13, 2025Updated last year
- See the issue board for the current status of active and prospective projects!☆65Feb 12, 2022Updated 4 years ago
- code associated with paper "Sparse Bayesian Optimization"☆26Oct 31, 2023Updated 2 years ago
- [ICCV 2021] Click to Move: Controlling Video Generation with Sparse Motion☆11Apr 14, 2023Updated 3 years ago
- Computationally friendly hyper-parameter search with DP-SGD☆25Jan 7, 2025Updated last year