harish-kamath / rqae
Residual Quantization Autoencoder, used for interpreting LLMs
☆13 Updated 10 months ago
Alternatives and similar repositories for rqae
Users that are interested in rqae are comparing it to the libraries listed below
- Code and Data Repo for the CoNLL Paper -- Future Lens: Anticipating Subsequent Tokens from a Single Hidden State ☆20 Updated 3 weeks ago
- ☆44 Updated last year
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆75 Updated last year
- This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity… ☆28 Updated 3 weeks ago
- Minimum Description Length probing for neural network representations ☆20 Updated 9 months ago
- ☆75 Updated 3 months ago
- ☆39 Updated last year
- Google Research ☆46 Updated 3 years ago
- Aioli: A unified optimization framework for language model data mixing ☆28 Updated 10 months ago
- Experiments on GPT-3's ability to fit numerical models in-context. ☆14 Updated 3 years ago
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆59 Updated 2 years ago
- ☆36 Updated 2 years ago
- ☆43 Updated 4 years ago
- ☆52 Updated last year
- Understanding how features learned by neural networks evolve throughout training ☆39 Updated last year
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?" ☆59 Updated 2 years ago
- A repository for transformer critique learning and generation ☆89 Updated last year
- ☆76 Updated last year
- ☆13 Updated 5 months ago
- PyTorch and NNsight implementation of AtP* (Kramar et al 2024, DeepMind) ☆20 Updated 9 months ago
- 🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment ☆11 Updated 7 months ago
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆17 Updated 7 months ago
- ☆24 Updated 7 months ago
- HomebrewNLP in JAX flavour for maintainable TPU-Training ☆51 Updated last year
- Sparse and discrete interpretability tool for neural networks ☆64 Updated last year
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆44 Updated last month
- Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments. ☆31 Updated 6 months ago
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆87 Updated last year
- A library to create and manage configuration files, especially for machine learning projects. ☆80 Updated 3 years ago
- Engineering the state of RNN language models (Mamba, RWKV, etc.) ☆32 Updated last year