Omnigrok: Grokking Beyond Algorithmic Data
☆63 · Feb 24, 2023 · Updated 3 years ago
Alternatives and similar repositories for Omnigrok
Users who are interested in Omnigrok are comparing it to the libraries listed below
- We study toy models of skill learning. ☆32 · Feb 3, 2026 · Updated last month
- ☆16 · Feb 28, 2025 · Updated last year
- ☆27 · Feb 1, 2023 · Updated 3 years ago
- Official code for the paper "Compositional Generalization from First Principles" (NeurIPS 2023) ☆14 · Jul 25, 2023 · Updated 2 years ago
- ☆12 · Jan 9, 2024 · Updated 2 years ago
- ☆18 · Feb 19, 2024 · Updated 2 years ago
- Source repository for NeuroML documentation. ☆18 · Updated this week
- ☆36 · Dec 12, 2023 · Updated 2 years ago
- PyTorch implementation of StableMask (ICML'24) ☆15 · Jun 27, 2024 · Updated last year
- The official repository for AdaMuon ☆35 · Aug 27, 2025 · Updated 6 months ago
- Fork of Flame repo for training of some new stuff in development ☆19 · Feb 27, 2026 · Updated last week
- Conditional RBM ☆15 · May 24, 2025 · Updated 9 months ago
- ☆18 · Mar 13, 2016 · Updated 9 years ago
- ☆20 · Mar 1, 2023 · Updated 3 years ago
- A library for bridging Python and HTML/Javascript (via Svelte) for creating interactive visualizations ☆15 · Apr 15, 2024 · Updated last year
- A Python implementation of the Ensemble Biclustering for Classification (EBC) algorithm. EBC is a co-clustering algorithm that allows you… ☆20 · Apr 7, 2017 · Updated 8 years ago
- Schrodinger Principal Component Analysis ☆23 · Jun 5, 2020 · Updated 5 years ago
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models ☆60 · Feb 7, 2025 · Updated last year
- ☆27 · May 3, 2024 · Updated last year
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆27 · Apr 17, 2024 · Updated last year
- Official Code Repository for the paper "Key-value memory in the brain" ☆31 · Feb 25, 2025 · Updated last year
- Official implementation of the transformer (TF) architecture suggested in a paper entitled "Looped Transformers as Programmable Computers… ☆35 · Apr 8, 2023 · Updated 2 years ago
- Fit a chess Win-Draw-Loss model from played games ☆32 · Jan 16, 2026 · Updated last month
- ☆28 · Mar 18, 2023 · Updated 2 years ago
- NeurIPS23 "Flow Factorized Representation Learning" ☆43 · Dec 15, 2025 · Updated 2 months ago
- [ICLR 2025] Code for the paper "Implicit Search via Discrete Diffusion: A Study on Chess" ☆37 · Mar 3, 2025 · Updated last year
- BitLinear implementation ☆35 · Jan 1, 2026 · Updated 2 months ago
- Deep Networks Grok All the Time and Here is Why ☆38 · May 18, 2024 · Updated last year
- [CVPR 2023] Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference ☆30 · Mar 14, 2024 · Updated last year
- ☆80 · Oct 17, 2024 · Updated last year
- Deeply supervised density regression for automatic cell counting in microscopy images ☆12 · Jan 31, 2022 · Updated 4 years ago
- Official repository for ACM Multimedia'23 paper "MATK: The Meme Analytical Tool Kit" ☆13 · May 29, 2024 · Updated last year
- A Lattice Library for Julia ☆35 · May 15, 2025 · Updated 9 months ago
- Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it for Adam. ☆86 · Jul 28, 2024 · Updated last year
- (CVPR 2024) Accelerating Neural Field Training via Soft Mining ☆41 · Dec 2, 2025 · Updated 3 months ago
- ☆44 · Jan 24, 2024 · Updated 2 years ago
- InstructIR, a novel benchmark specifically designed to evaluate the instruction-following ability in information retrieval models. Our foc… ☆32 · Jun 13, 2024 · Updated last year
- [CVPR2025] Breaking the Low-Rank Dilemma of Linear Attention ☆39 · Mar 11, 2025 · Updated 11 months ago
- Cell Segmenter using Machine Learning ☆10 · Aug 1, 2023 · Updated 2 years ago