misko / human_descent
☆37 · Updated last week
Alternatives and similar repositories for human_descent
Users interested in human_descent are comparing it to the repositories listed below.
- Getting crystal-like representations with harmonic loss ☆192 · Updated 7 months ago
- Implementation of Diffusion Transformer (DiT) in JAX ☆296 · Updated last year
- 🧱 Modula software package ☆307 · Updated 3 months ago
- ☆20 · Updated 7 months ago
- The boundary of neural network trainability is fractal ☆221 · Updated last year
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆106 · Updated 2 months ago
- Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable. ☆174 · Updated 2 years ago
- ☆285 · Updated last year
- ☆201 · Updated 3 months ago
- Simple Transformer in Jax ☆139 · Updated last year
- Graph neural networks in JAX. ☆68 · Updated last year
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆147 · Updated last month
- ☆44 · Updated 3 weeks ago
- ☆224 · Updated 11 months ago
- WIP ☆93 · Updated last year
- ☆460 · Updated last year
- History files recorded from human interaction while solving ARC tasks ☆118 · Updated 2 weeks ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al. (NeurIPS 2024) ☆194 · Updated last year
- An implementation of PSGD Kron second-order optimizer for PyTorch ☆97 · Updated 4 months ago
- Bare-bones implementations of some generative models in Jax: diffusion, normalizing flows, consistency models, flow matching, (beta)-VAEs… ☆137 · Updated last year
- Cellular Automata Accelerated in JAX (Oral at ICLR 2025) ☆232 · Updated 3 weeks ago
- Minimal yet performant LLM examples in pure JAX ☆202 · Updated 2 months ago
- Exact OU processes with JAX ☆56 · Updated 8 months ago
- ☆530 · Updated 3 months ago
- ☆28 · Updated 2 months ago
- σ-GPT: A New Approach to Autoregressive Models ☆70 · Updated last year
- A zero-to-one guide on scaling modern transformers with n-dimensional parallelism. ☆104 · Updated 2 months ago
- The AdEMAMix Optimizer: Better, Faster, Older. ☆186 · Updated last year
- Minimal GPT (~350 lines with a simple task to test it) ☆63 · Updated 11 months ago
- ☆56 · Updated last year