enjalot / latent-sae
Training code for Sparse Autoencoders on Embedding models
☆35 · Updated last month
Alternatives and similar repositories for latent-sae:
Users who are interested in latent-sae are comparing it to the libraries listed below:
- NLP with Rust for Python 🦀🐍 ☆60 · Updated 7 months ago
- An introduction to LLM Sampling ☆75 · Updated last month
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆24 · Updated 10 months ago
- ☆48 · Updated last year
- Chat Markup Language conversation library ☆55 · Updated last year
- utilities for loading and running text embeddings with onnx ☆40 · Updated 5 months ago
- ☆27 · Updated 3 months ago
- Using modal.com to process FineWeb-edu data ☆19 · Updated last month
- look how they massacred my boy ☆63 · Updated 3 months ago
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆79 · Updated 10 months ago
- ☆22 · Updated last year
- BPE modification that removes intermediate tokens during tokenizer training. ☆25 · Updated last month
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆49 · Updated 10 months ago
- minimal pytorch implementation of bm25 (with sparse tensors) ☆97 · Updated 10 months ago
- a pipeline for using api calls to agnostically convert unstructured data into structured training data ☆29 · Updated 3 months ago
- ☆79 · Updated last week
- Synthetic data derived by templating, few shot prompting, transformations on public domain corpora, and monte carlo tree search. ☆26 · Updated last month
- ☆46 · Updated 2 months ago
- ☆37 · Updated 5 months ago
- Training hybrid models for dummies. ☆16 · Updated last month
- ☆20 · Updated 2 months ago
- ☆62 · Updated 3 months ago
- Functional Benchmarks and the Reasoning Gap ☆82 · Updated 3 months ago
- ☆49 · Updated 4 months ago
- ☆49 · Updated 10 months ago
- Lightweight tools for quick and easy LLM demos ☆26 · Updated 3 months ago
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping. ☆25 · Updated last year
- [WIP] Transformer to embed Danbooru labelsets ☆13 · Updated 9 months ago
- Tools to make language models a bit easier to use ☆32 · Updated last month