mwatkins1970 / SAE_Feature_Interpretability_Tool
A tool to assist in the interpretation of learned features in sparse autoencoders (in particular the four SAEs trained by Joseph Bloom on Gemma-2B).
☆19Updated 6 months ago
Alternatives and similar repositories for SAE_Feature_Interpretability_Tool:
Users who are interested in SAE_Feature_Interpretability_Tool are comparing it to the libraries listed below:
- Code and data releases for the paper -- DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory☆40Updated 2 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models☆21Updated 5 months ago
- Using modal.com to process FineWeb-edu data☆20Updated 2 weeks ago
- The code repository for the CURLoRA research paper. Stable LLM continual fine-tuning and catastrophic forgetting mitigation.☆43Updated 7 months ago
- Alternative way of calculating self-attention☆18Updated 11 months ago
- implementation of https://arxiv.org/pdf/2312.09299☆20Updated 9 months ago
- [NeurIPS XAIA & Springer] Code and notebooks to paper "A Fresh Look at Sanity Checks for Saliency Maps"☆25Updated 9 months ago
- Latent Large Language Models☆17Updated 8 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆55Updated 7 months ago
- Problem-Oriented Segmentation and Retrieval EMNLP 2024 Findings☆31Updated 5 months ago
- Backtracing: Retrieving the Cause of the Query, EACL 2024 Long Paper, Findings.☆89Updated 9 months ago
- Enhancement in Multimodal Representation Learning.☆40Updated last year
- A list of language models with permissive licenses such as MIT or Apache 2.0☆24Updated last month
- Repository to create traveling waves that integrate spatial information through time☆50Updated last month
- XmodelLM☆39Updated 5 months ago
- Training hybrid models for dummies.☆20Updated 3 months ago
- Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models)☆33Updated last month
- The first dense retrieval model that can be prompted like an LM☆71Updated 7 months ago
- This repository includes the code to download the curated HuggingFace papers into a single markdown formatted file☆14Updated 8 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆39Updated 2 months ago
- BH hackathon☆14Updated last year
- Training code for Sparse Autoencoders on Embedding models☆38Updated last month
- The official implementation of Cross-Task Experience Sharing (COPS)☆22Updated 6 months ago