enjalot / latent-sae
Training code for Sparse Autoencoders on Embedding models
☆39 · Feb 27, 2025 · Updated 11 months ago
Alternatives and similar repositories for latent-sae
Users that are interested in latent-sae are comparing it to the libraries listed below
- Using modal.com to process FineWeb-edu data ☆20 · Apr 5, 2025 · Updated 10 months ago
- An introduction to LLM Sampling ☆79 · Dec 15, 2024 · Updated last year
- User-friendly viewer for Parquet files ☆10 · Jan 10, 2026 · Updated last month
- coded with and corrected by Google Anti-Gravity ☆13 · Nov 23, 2025 · Updated 2 months ago
- ☆20 · Nov 18, 2024 · Updated last year
- MEXMA: Token-level objectives improve sentence representations ☆42 · Jan 6, 2025 · Updated last year
- A collection of tools for your LLMs that run on Modal ☆23 · Feb 28, 2025 · Updated 11 months ago
- Sparsify transformers with SAEs and transcoders ☆692 · Feb 9, 2026 · Updated last week
- A toy text-to-image model trained from scratch. ☆19 · Jun 9, 2025 · Updated 8 months ago
- Finetune your embeddings in-browser ☆34 · Apr 14, 2024 · Updated last year
- ☆57 · Jan 26, 2025 · Updated last year
- ☆14 · Jul 7, 2024 · Updated last year
- ☆16 · Jun 19, 2023 · Updated 2 years ago
- Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models" ☆13 · Jul 18, 2024 · Updated last year
- Sparse Autoencoder Training Library ☆56 · May 1, 2025 · Updated 9 months ago
- Sampling-Based Minimum Bayes-Risk Decoding for Neural Machine Translation ☆16 · Oct 14, 2022 · Updated 3 years ago
- Code for the paper "Multi-Field Adaptive Retrieval," a research project on a semi-structured document retrieval ☆15 · Updated this week
- Sparse Autoencoders (SAE) vs CLIP fine-tuning fun. ☆18 · Dec 19, 2024 · Updated last year
- The training codes of Jasper-Token-Compression-600M ☆19 · Nov 19, 2025 · Updated 2 months ago
- ☆25 · Oct 27, 2025 · Updated 3 months ago
- code for training & evaluating Contextual Document Embedding models ☆202 · May 14, 2025 · Updated 9 months ago
- Simple Transformer in Jax ☆142 · Jun 22, 2024 · Updated last year
- Training hybrid models for dummies. ☆29 · Nov 1, 2025 · Updated 3 months ago
- utilities for loading and running text embeddings with onnx ☆45 · Aug 16, 2025 · Updated 6 months ago
- Pre-train Static Word Embeddings ☆94 · Sep 9, 2025 · Updated 5 months ago
- A high performance batching router optimises max throughput for text inference workload ☆16 · Sep 6, 2023 · Updated 2 years ago
- Multimodal extreme classification ☆20 · May 1, 2024 · Updated last year
- Code & data for EMNLP 2020 paper "MOCHA: A Dataset for Training and Evaluating Reading Comprehension Metrics". ☆16 · May 3, 2022 · Updated 3 years ago
- A single static file as vector database, using the cloud-native flatgeobuf file format and http range requests ☆17 · Oct 28, 2025 · Updated 3 months ago
- This library supports evaluating disparities in generated image quality, diversity, and consistency between geographic regions. ☆20 · Jun 3, 2024 · Updated last year
- Contextualized per-token embeddings ☆34 · May 11, 2025 · Updated 9 months ago
- FormFill is a CLI tool that uses LLMs to automatically fill out PDF forms. ☆29 · Nov 22, 2024 · Updated last year
- ☆21 · Mar 28, 2024 · Updated last year
- Word sense disambiguation test sets for NMT ☆20 · Dec 3, 2020 · Updated 5 years ago
- Late Interaction Models Training & Retrieval ☆711 · Updated this week
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o… ☆156 · Jul 14, 2025 · Updated 7 months ago
- Modify Entropy Based Sampling to work with Mac Silicon via MLX ☆49 · Nov 6, 2024 · Updated last year
- ☆47 · Mar 27, 2022 · Updated 3 years ago
- Simplified implementation of UMAP like dimensionality reduction algorithm ☆53 · Nov 18, 2024 · Updated last year