unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"
☆90Jul 4, 2022Updated 3 years ago
Alternatives and similar repositories for grokking
Users that are interested in grokking are comparing it to the libraries listed below.
- PyTorch implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"☆39Dec 7, 2021Updated 4 years ago
- ☆12Jul 30, 2025Updated 10 months ago
- Code for the paper "Function-Space Learning Rates"☆24Jun 3, 2025Updated last year
- ☆28Feb 1, 2023Updated 3 years ago
- ☆21Aug 18, 2022Updated 3 years ago
- Code repository for the research project "You Play Ball, I Play Ball: Bayesian Multi-Agent Reinforcement Learning for Slime Volleyball", …☆17Nov 15, 2020Updated 5 years ago
- In-silico design pipeline for evaluating protein structure diffusion models.☆31Jun 25, 2024Updated last year
- Official Implementation of PatentLMM (our AAAI 2025 Paper)☆26Updated this week
- [NeurIPS 2025] Official code for "Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms"☆31May 21, 2026Updated 3 weeks ago
- ☆16Feb 6, 2024Updated 2 years ago
- The Lean Theorem Proving Environment☆15May 7, 2023Updated 3 years ago
- Code for the AISTATS 2024 Paper "From Data Imputation to Data Cleaning - Automated Cleaning of Tabular Data Improves Downstream Predictiv…☆24Feb 14, 2024Updated 2 years ago
- JAX Scalify: end-to-end scaled arithmetics☆18Oct 30, 2024Updated last year
- command line fractal rendering☆13Mar 25, 2022Updated 4 years ago
- ☆55Apr 11, 2023Updated 3 years ago
- Tools for studying developmental interpretability in neural networks.☆140Apr 23, 2026Updated last month
- ☆14Mar 4, 2024Updated 2 years ago
- Two implementations of ZeRO-1 optimizer sharding in JAX☆14Jun 11, 2023Updated 3 years ago
- This repo is built to facilitate the training and analysis of autoregressive transformers on maze-solving tasks.☆35Oct 28, 2025Updated 7 months ago
- Piece-wise Linear curves converted to point-clouds, analysed with Persistent Homology, represented as HyperGraphs.☆16Apr 3, 2025Updated last year
- ☆37Feb 16, 2025Updated last year
- Universal Neurons in GPT2 Language Models☆30May 28, 2024Updated 2 years ago
- Source code of "What can linearized neural networks actually say about generalization?"☆20Oct 21, 2021Updated 4 years ago
- Exploring Model Kinship for Merging Large Language Models☆29Apr 16, 2025Updated last year
- Code for the paper Alpha Zero in Continuous Action Space (A0C) (https://arxiv.org/pdf/1805.09613.pdf)☆15Jan 19, 2021Updated 5 years ago
- On the roots of beauty☆13Nov 27, 2022Updated 3 years ago
- Utilities for the HuggingFace transformers library☆76Jan 21, 2023Updated 3 years ago
- Code related to my Bachelor's Thesis Project☆13Jun 17, 2016Updated 10 years ago
- Offline RL experiments☆15Oct 1, 2022Updated 3 years ago
- Implementation of Ordinal Classification Paper using Logistic Regression and SVM☆12Jun 10, 2017Updated 9 years ago
- Reproduction of AlphaTensor paper for 2x2 matrices☆16Nov 5, 2023Updated 2 years ago
- ☆15Jun 10, 2022Updated 4 years ago
- An agent for playing Atari games running on a Teensy microcontroller☆15Nov 11, 2022Updated 3 years ago
- Code for "Just Train Twice: Improving Group Robustness without Training Group Information"☆73May 18, 2024Updated 2 years ago
- ☆15Apr 1, 2020Updated 6 years ago
- Minimal (400 LOC) implementation, maximum (multi-node, FSDP) GPT training☆132Apr 17, 2024Updated 2 years ago
- A collection of lab materials from the Digitalent Scholarship Python Essentials 2019☆10Mar 27, 2022Updated 4 years ago
- Official repo of paper LM2☆48Feb 13, 2025Updated last year
- See the issue board for the current status of active and prospective projects!☆65Feb 12, 2022Updated 4 years ago