google-deepmind / neural_networks_chomsky_hierarchy
Neural Networks and the Chomsky Hierarchy
☆187 Updated 7 months ago
Related projects
Alternatives and complementary repositories for neural_networks_chomsky_hierarchy
- ☆54 Updated 2 years ago
- Emergent world representations: Exploring a sequence model trained on a synthetic task ☆169 Updated last year
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆97 Updated 2 years ago
- ☆251 Updated 2 years ago
- Mechanistic Interpretability for Transformer Models ☆49 Updated 2 years ago
- Language-annotated Abstraction and Reasoning Corpus ☆78 Updated last year
- ☆161 Updated last year
- A domain-specific probabilistic programming language for modeling and inference with language models ☆112 Updated last year
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆57 Updated last year
- ☆188 Updated last month
- Python library which enables complex compositions of language models such as scratchpads, chain of thought, tool use, selection-inference… ☆196 Updated 5 months ago
- [NeurIPS 2023] Learning Transformer Programs ☆157 Updated 6 months ago
- Train very large language models in Jax. ☆195 Updated last year
- Learning Universal Predictors ☆69 Updated 3 months ago
- Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework) ☆185 Updated 2 years ago
- Materials for ConceptARC paper ☆77 Updated 2 weeks ago
- Inference code for LLaMA models in JAX ☆113 Updated 6 months ago
- ☆508 Updated 9 months ago
- ☆197 Updated 4 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆95 Updated 6 months ago
- ☆98 Updated 3 months ago
- An interpreter for RASP as described in the ICML 2021 paper "Thinking Like Transformers" ☆284 Updated 2 months ago
- ☆207 Updated 6 months ago
- ☆253 Updated 8 months ago
- ☆76 Updated 9 months ago
- An interactive exploration of Transformer programming. ☆246 Updated last year
- Mechanistic Interpretability Visualizations using React ☆198 Updated 4 months ago
- Unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" ☆63 Updated 2 years ago
- JAX implementation of the Llama 2 model ☆210 Updated 9 months ago
- ☆127 Updated 10 months ago