misko / human_descent
☆36 Updated 2 months ago
Alternatives and similar repositories for human_descent:
Users that are interested in human_descent are comparing it to the libraries listed below
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆48 Updated 2 months ago
- A package for defining deep learning models using categorical algebraic expressions. ☆59 Updated 6 months ago
- 🧱 Modula software package ☆139 Updated this week
- Flow-matching algorithms in JAX ☆83 Updated 6 months ago
- Exact OU processes with JAX ☆41 Updated 4 months ago
- Graph neural networks in JAX. ☆67 Updated 7 months ago
- ☆59 Updated 2 years ago
- Simple Transformer in Jax ☆136 Updated 7 months ago
- supporting pytorch FSDP for optimizers ☆76 Updated 2 months ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆119 Updated last week
- Cellular Automata Accelerated in JAX (Oral at ICLR 2025). ☆81 Updated 2 months ago
- ☆47 Updated 2 months ago
- Because we don't want a jupyter notebook mess... ☆61 Updated last month
- ☆21 Updated 4 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆102 Updated last month
- An implementation of PSGD Kron second-order optimizer for PyTorch ☆80 Updated this week
- The AdEMAMix Optimizer: Better, Faster, Older. ☆177 Updated 5 months ago
- ☆25 Updated last year
- Focused on fast experimentation and simplicity ☆65 Updated last month
- The history files when recording human interaction while solving ARC tasks ☆97 Updated this week
- look how they massacred my boy ☆63 Updated 3 months ago
- The boundary of neural network trainability is fractal ☆194 Updated last year
- σ-GPT: A New Approach to Autoregressive Models ☆61 Updated 6 months ago
- ☆75 Updated 7 months ago
- WIP ☆93 Updated 6 months ago
- ☆53 Updated last year
- Implementation of Diffusion Transformer (DiT) in JAX ☆264 Updated 8 months ago
- An introduction to LLM Sampling ☆75 Updated 2 months ago
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆121 Updated 9 months ago