enjalot / latent-sae
Training code for Sparse Autoencoders on Embedding models
☆27 Updated last week
Related projects:
- NLP with Rust for Python 🦀🐍 ☆57 Updated 3 months ago
- ☆58 Updated 3 weeks ago
- Late Interaction Models Training & Retrieval ☆130 Updated last week
- ☆38 Updated this week
- minimal pytorch implementation of bm25 (with sparse tensors) ☆82 Updated 6 months ago
- ☆56 Updated this week
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆73 Updated 6 months ago
- ☆29 Updated 3 weeks ago
- BPE modification that implements removing of the intermediate tokens during tokenizer training. ☆13 Updated last week
- ☆48 Updated 11 months ago
- Lightweight tools for quick and easy LLM demos ☆22 Updated 2 weeks ago
- ☆68 Updated last month
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆22 Updated 6 months ago
- utilities for loading and running text embeddings with onnx ☆39 Updated last month
- Functional Benchmarks and the Reasoning Gap ☆74 Updated last month
- ☆68 Updated 2 months ago
- Synthetic data derived by templating, few shot prompting, transformations on public domain corpora, and monte carlo tree search. ☆18 Updated 2 months ago
- ☆43 Updated 7 months ago
- Chat Markup Language conversation library ☆53 Updated 8 months ago
- ☆22 Updated last year
- A miniature version of Modal ☆18 Updated 3 months ago
- Reference-Free automatic summarization evaluation with potential hallucination detection ☆99 Updated 8 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆33 Updated 6 months ago
- ☆91 Updated last month
- ☆23 Updated 2 weeks ago
- Using modal.com to process FineWeb-edu data ☆18 Updated last week
- Aidan Bench attempts to measure <big_model_smell> in LLMs. ☆64 Updated this week
- ☆75 Updated 3 weeks ago
- ☆62 Updated 5 months ago
- Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of… ☆73 Updated last month