francoisfleuret / picogpt
Minimal GPT (~350 lines with a simple task to test it)
☆62 Updated 9 months ago
Alternatives and similar repositories for picogpt
Users who are interested in picogpt are comparing it to the libraries listed below.
- ☆150 Updated last year
- ☆44 Updated 2 months ago
- The boundary of neural network trainability is fractal ☆217 Updated last year
- Getting crystal-like representations with harmonic loss ☆194 Updated 6 months ago
- Jax like function transformation engine but micro, microjax ☆32 Updated 11 months ago
- σ-GPT: A New Approach to Autoregressive Models ☆68 Updated last year
- A package for defining deep learning models using categorical algebraic expressions. ☆61 Updated last year
- An implementation of PSGD Kron second-order optimizer for PyTorch ☆95 Updated 2 months ago
- Because we don't want a jupyter notebook mess... ☆61 Updated 4 months ago
- ☆28 Updated last year
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆147 Updated last week
- Implementation of the proposed Spline-Based Transformer from Disney Research ☆104 Updated 11 months ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ☆102 Updated 9 months ago
- ☆65 Updated 11 months ago
- Diffusion models in PyTorch ☆111 Updated 3 weeks ago
- ☆53 Updated last year
- Various handy scripts to quickly setup new Linux and Windows sandboxes, containers and WSL. ☆40 Updated this week
- ☆28 Updated last week
- Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single machine microbatches, in Pytorch ☆25 Updated 8 months ago
- H-Net Dynamic Hierarchical Architecture ☆80 Updated last month
- Code for "Training-free Graph Neural Networks and the Power of Labels as Features" (TMLR 2024) ☆57 Updated last year
- Because we don't have enough time to read everything ☆89 Updated last year
- Exploration into the Firefly algorithm in Pytorch ☆41 Updated 7 months ago
- Graph neural networks in JAX. ☆68 Updated last year
- ☆81 Updated last year
- lossily compress representation vectors using product quantization ☆59 Updated 5 months ago
- ☆60 Updated 3 years ago
- Induce brain-like topographic structure in your neural networks ☆69 Updated 2 months ago
- ☆34 Updated last year
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆98 Updated last week