Bond1995 / Markov
Code for experiments on transformers using Markovian data.
☆11 · Updated 5 months ago
Alternatives and similar repositories for Markov:
Users who are interested in Markov are comparing it to the repositories listed below:
- Deep Networks Grok All the Time and Here is Why ☆34 · Updated 11 months ago
- ☆18 · Updated 9 months ago
- Efficient Scaling laws and collaborative pretraining. ☆16 · Updated 2 months ago
- Universal Neurons in GPT2 Language Models ☆27 · Updated 10 months ago
- ☆31 · Updated 3 months ago
- Source code for the paper "Positional Attention: Out-of-Distribution Generalization and Expressivity for Neural Algorithmic Reasoning" ☆14 · Updated 2 months ago
- Official code for the paper "Compositional Generalization from First Principles" (NeurIPS 2023) ☆11 · Updated last year
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆10 · Updated 3 weeks ago
- This repo is built to facilitate the training and analysis of autoregressive transformers on maze-solving tasks. ☆27 · Updated 7 months ago
- Code for "Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations" ☆23 · Updated 2 years ago
- ☆32 · Updated 6 months ago
- ☆31 · Updated 6 months ago
- ☆49 · Updated last year
- Code for the paper "Function-Space Learning Rates" ☆19 · Updated last week
- ☆37 · Updated last year
- ☆31 · Updated 11 months ago
- Code for GFlowNet-EM, a novel algorithm for fitting latent variable models with compositional latents and an intractable true posterior. ☆40 · Updated last year
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers ☆17 · Updated last month
- ☆13 · Updated 2 years ago
- Official code for the paper "Attention as a Hypernetwork" ☆28 · Updated 10 months ago
- ☆26 · Updated last year
- ☆15 · Updated last year
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆36 · Updated 2 years ago
- ☆22 · Updated 2 months ago
- ☆18 · Updated last month
- ☆31 · Updated last year
- This is the official repository for the "Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP" paper acce… ☆22 · Updated last year
- JAX implementation of "Fine-Tuning Language Models with Just Forward Passes" ☆19 · Updated last year
- Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts" ☆15 · Updated last month
- ☆30 · Updated 5 months ago