enjalot / latent-sae
Training code for Sparse Autoencoders on Embedding models
☆35 Updated 2 months ago
Alternatives and similar repositories for latent-sae
Users who are interested in latent-sae are comparing it to the repositories listed below:
- An introduction to LLM Sampling ☆75 Updated 2 months ago
- NLP with Rust for Python 🦀🐍 ☆61 Updated 8 months ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆24 Updated 11 months ago
- Synthetic data derived by templating, few shot prompting, transformations on public domain corpora, and monte carlo tree search. ☆30 Updated 2 months ago
- look how they massacred my boy ☆63 Updated 4 months ago
- ☆48 Updated 3 months ago
- Functional Benchmarks and the Reasoning Gap ☆82 Updated 4 months ago
- [WIP] Transformer to embed Danbooru labelsets ☆13 Updated 10 months ago
- Using modal.com to process FineWeb-edu data ☆20 Updated 2 months ago
- ☆57 Updated 4 months ago
- ☆37 Updated 6 months ago
- Chat Markup Language conversation library ☆55 Updated last year
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆79 Updated 11 months ago
- BPE modification that implements removing of the intermediate tokens during tokenizer training. ☆25 Updated 2 months ago
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection ☆101 Updated last year
- ☆20 Updated 3 months ago
- Experiments for efforts to train a new and improved t5 ☆77 Updated 10 months ago
- Latent Large Language Models ☆17 Updated 5 months ago
- ☆26 Updated 4 months ago
- ☆19 Updated 4 months ago
- ☆48 Updated last year
- utilities for loading and running text embeddings with onnx ☆44 Updated 6 months ago
- ☆51 Updated 5 months ago
- Pre-train Static Word Embeddings ☆47 Updated 3 weeks ago
- a pipeline for using api calls to agnostically convert unstructured data into structured training data ☆29 Updated 4 months ago
- ☆9 Updated 3 months ago