rahul13ramesh / compositional_capabilities
Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
☆10 Updated 9 months ago
Alternatives and similar repositories for compositional_capabilities:
Users who are interested in compositional_capabilities are comparing it to the repositories listed below:
- Official code for the paper "Compositional Generalization from First Principles" (NeurIPS 2023) ☆11 Updated last year
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆36 Updated 2 years ago
- [ICML'21 Oral] Improving Lossless Compression Rates via Monte Carlo Bits-Back Coding ☆14 Updated 3 years ago
- Very deep VAEs in JAX/Flax ☆46 Updated 3 years ago
- Codebase for Mechanistic Mode Connectivity ☆13 Updated last year
- General Invertible Transformations for Flow-based Generative Models ☆17 Updated 4 years ago
- Deep Networks Grok All the Time and Here is Why ☆34 Updated 11 months ago
- Usable implementation of Emerging Symbol Binding Network (ESBN), in Pytorch ☆24 Updated 4 years ago
- Experiments for Meta-Learning Symmetries by Reparameterization ☆56 Updated 4 years ago
- Official code for "Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving", ICML 2021 ☆27 Updated 3 years ago
- AdaCat ☆49 Updated 2 years ago
- Open source code for paper "On the Learning and Learnability of Quasimetrics". ☆32 Updated 2 years ago
- Blog post ☆17 Updated last year
- Code associated with our paper "Learning Group Structure and Disentangled Representations of Dynamical Environments" ☆15 Updated 2 years ago
- Transformers with doubly stochastic attention ☆45 Updated 2 years ago
- SGD with large step sizes learns sparse features [ICML 2023] ☆32 Updated 2 years ago
- Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks" ☆59 Updated 3 years ago
- code for "Semi-Discrete Normalizing Flows through Differentiable Tessellation" ☆26 Updated 2 years ago
- Code from the article: "The Role of Disentanglement in Generalisation" (ICLR, 2021). ☆22 Updated 2 years ago
- [NeurIPS'20] Code for the Paper Compositional Visual Generation and Inference with Energy Based Models ☆44 Updated 2 years ago
- Euclidean Wasserstein-2 optimal transportation ☆47 Updated last year