open-thought / arc-agi-2
Building the cognitive-core to solve ARC-AGI-2
☆17 · Updated 2 weeks ago
Alternatives and similar repositories for arc-agi-2:
Users interested in arc-agi-2 are comparing it to the libraries listed below.
- A MAD laboratory to improve AI architecture designs 🧪 ☆102 · Updated 2 months ago
- ☆49 · Updated last year
- ☆75 · Updated 7 months ago
- ☆53 · Updated last year
- nanoGPT-like codebase for LLM training ☆89 · Updated this week
- Proof-of-concept of global switching between numpy/jax/pytorch in a library. ☆18 · Updated 8 months ago
- Experiment of using Tangent to autodiff triton ☆75 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆95 · Updated 3 months ago
- Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks" ☆60 · Updated 2 years ago
- ☆52 · Updated 4 months ago
- Implementation of PSGD optimizer in JAX ☆28 · Updated last month
- A simple library for scaling up JAX programs ☆129 · Updated 3 months ago
- Minimal (400 LOC) implementation, maximum (multi-node, FSDP) GPT training ☆122 · Updated 10 months ago
- Supporting PyTorch FSDP for optimizers ☆76 · Updated 2 months ago
- seqax = sequence modeling + JAX ☆143 · Updated 7 months ago
- The simplest implementation of recent Sparse Attention patterns for efficient LLM inference. ☆57 · Updated 3 weeks ago
- Minimal but scalable implementation of large language models in JAX ☆32 · Updated 3 months ago
- Triton Implementation of HyperAttention Algorithm ☆46 · Updated last year
- Unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" ☆72 · Updated 2 years ago
- WIP ☆93 · Updated 6 months ago
- A basic pure PyTorch implementation of flash attention ☆16 · Updated 3 months ago
- Understand and test language model architectures on synthetic tasks. ☆181 · Updated last month
- ☆51 · Updated 9 months ago
- A puzzle to learn about prompting ☆124 · Updated last year
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆53 · Updated 2 months ago
- ☆211 · Updated 7 months ago
- JAX-like function transformation engine, but micro: microjax ☆30 · Updated 3 months ago
- Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆81 · Updated last year
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆58 · Updated last year
- ☆47 · Updated 2 months ago