TomFrederik / grokking
Re-implementation of 'Grokking: Generalization beyond overfitting on small algorithmic datasets'
☆39Dec 4, 2021Updated 4 years ago
Alternatives and similar repositories for grokking
Users that are interested in grokking are comparing it to the libraries listed below
- The original weights of some Caffe models, ported to PyTorch.☆11Jan 18, 2022Updated 4 years ago
- A better PyTorch implementation of image local attention which reduces the GPU memory by an order of magnitude.☆142Dec 21, 2021Updated 4 years ago
- A basic implementation of the paper Eigengame : PCA as a Nash Equilibrium☆21Jun 7, 2021Updated 4 years ago
- Self-Similarity Priors: Neural Collages as Differentiable Fractal Representations☆29Nov 26, 2022Updated 3 years ago
- Official code for On Path Integration of Grid Cells: Group Representation and Isotropic Scaling (NeurIPS 2021)☆54Nov 10, 2021Updated 4 years ago
- Visual search interface☆11Nov 30, 2021Updated 4 years ago
- Official repository for the paper: "Trees with Attention for Set Prediction Tasks" (ICML21)☆10Jan 19, 2022Updated 4 years ago
- Framework for stochastic modelling in systems biology☆12Aug 11, 2022Updated 3 years ago
- DALLE-tools provides useful dataset utilities to improve your workflow with WebDatasets.☆15Mar 9, 2022Updated 3 years ago
- ☆10Sep 13, 2021Updated 4 years ago
- VQGAN+CLIP with some additional tuning. For notebooks and the command line.☆50Aug 20, 2021Updated 4 years ago
- Official code for "Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving", ICML 2021☆29Sep 25, 2021Updated 4 years ago
- ☆20Aug 19, 2021Updated 4 years ago
- Finetune the 1.4B latent diffusion text2img-large checkpoint from CompVis using deepspeed. (work-in-progress)☆36Apr 17, 2022Updated 3 years ago
- Repo for storing the files I use to make animations with big-sleep, deep-daze, and VQGAN + CLIP.☆16Sep 14, 2021Updated 4 years ago
- Produce intelligence by means of natural selection without objective/reward optimization☆15Sep 29, 2021Updated 4 years ago
- This repository hosts code for converting the original MLP Mixer models (JAX) to TensorFlow.☆15Sep 29, 2021Updated 4 years ago
- Contrastive Language-Audio Pretraining☆15May 18, 2021Updated 4 years ago
- Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch☆88Dec 3, 2021Updated 4 years ago
- Variational autoencoder for Lego minifig faces☆16May 22, 2023Updated 2 years ago
- ESGD-M is a stochastic non-convex second order optimizer, suitable for training deep learning models, for PyTorch.☆56Sep 18, 2022Updated 3 years ago
- Digital paint mixing program based on the Kubelka-Munk equations. Implementation of : T. Lindemeier, J. M. Gülzow, and O. Deussen. 2018…☆15Sep 10, 2020Updated 5 years ago
- CLOOB training (JAX) and inference (JAX and PyTorch)☆74May 16, 2022Updated 3 years ago
- Texture mapping with variational auto-encoders☆40Oct 1, 2021Updated 4 years ago
- Repository for the "Gotta Go Fast When Generating Data with Score-Based Models" paper☆105Nov 20, 2021Updated 4 years ago
- ☆19Oct 3, 2022Updated 3 years ago
- An adaptive training algorithm for residual network☆17Aug 22, 2020Updated 5 years ago
- SNAIL Attention Block for Keras.☆17Mar 30, 2020Updated 5 years ago
- Refining continuous-in-depth neural networks☆42Nov 14, 2021Updated 4 years ago
- ☆18Jan 8, 2024Updated 2 years ago
- Code for: "Neural Rough Differential Equations for Long Time Series", (ICML 2021)☆122May 11, 2021Updated 4 years ago
- ☆22Aug 14, 2021Updated 4 years ago
- Understanding Self-Supervised Learning in a non-IID Setting☆21Oct 21, 2022Updated 3 years ago
- Unofficial Alias-Free GAN implementation. Based on rosinality's version with expanded training and inference options.☆76Aug 3, 2023Updated 2 years ago
- Understanding the Difficulty of Training Transformers☆47Oct 30, 2022Updated 3 years ago
- Contrastive Language-Audio Pretraining☆88Mar 6, 2022Updated 3 years ago
- Code for "Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning"☆415Mar 21, 2024Updated last year
- Computes the strongly connected components of a directed graph☆29Apr 28, 2016Updated 9 years ago
- A CLIP conditioned Decision Transformer.☆22Jul 14, 2021Updated 4 years ago