misko / human_descent
☆36 Updated 6 months ago
Alternatives and similar repositories for human_descent
Users who are interested in human_descent are comparing it to the libraries listed below.
- A package for defining deep learning models using categorical algebraic expressions. ☆61 Updated 11 months ago
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆87 Updated 3 months ago
- 🧱 Modula software package ☆200 Updated 3 months ago
- ☆126 Updated last month
- ☆43 Updated 3 weeks ago
- Graph neural networks in JAX. ☆67 Updated last year
- Minimal GPT (~350 lines with a simple task to test it) ☆62 Updated 6 months ago
- ☆27 Updated last year
- ☆270 Updated 11 months ago
- WIP ☆93 Updated 10 months ago
- An implementation of PSGD Kron second-order optimizer for PyTorch ☆91 Updated 2 months ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆140 Updated last month
- Flow-matching algorithms in JAX ☆97 Updated 10 months ago
- Simple Transformer in Jax ☆137 Updated last year
- ☆60 Updated 3 years ago
- Getting crystal-like representations with harmonic loss ☆190 Updated 2 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆139 Updated this week
- A graph visualization of attention ☆56 Updated last month
- supporting pytorch FSDP for optimizers ☆82 Updated 6 months ago
- Implementation of Diffusion Transformer (DiT) in JAX ☆278 Updated last year
- Because we don't want a jupyter notebook mess... ☆61 Updated 2 weeks ago
- The history files when recording human interaction while solving ARC tasks ☆112 Updated 2 weeks ago
- A functional training loops library for JAX ☆88 Updated last year
- ☆78 Updated 11 months ago
- ☆98 Updated 5 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers ☆66 Updated 2 months ago
- Your favourite classical machine learning algos on the GPU/TPU ☆20 Updated 5 months ago
- σ-GPT: A New Approach to Autoregressive Models ☆65 Updated 10 months ago
- JAX Arrays for human consumption ☆93 Updated last week
- Computational abilities and efficiency of neural networks ☆52 Updated last week