MinishLab / model2vec
Fast State-of-the-Art Static Embeddings
☆1,088 Updated last week
Alternatives and similar repositories for model2vec:
Users who are interested in model2vec are comparing it to the libraries listed below.
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆750 Updated last month
- Fast Semantic Text Deduplication ☆564 Updated last week
- Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy ☆1,049 Updated this week
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,327 Updated 2 weeks ago
- Things you can do with the token embeddings of an LLM ☆1,427 Updated last month
- Synthetic data curation for post-training and structured data extraction ☆947 Updated this week
- Bringing BERT into modernity via both architecture changes and scaling ☆1,262 Updated 3 weeks ago
- ColiVara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st… ☆844 Updated last month
- Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜 ☆1,264 Updated 3 weeks ago
- LOTUS: A semantic query engine for fast and easy LLM-powered data processing ☆1,117 Updated this week
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻🍳 ☆260 Updated 2 months ago
- Fast, Accurate, Lightweight Python library to make State of the Art Embedding ☆1,844 Updated this week
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆2,546 Updated this week
- High-performance retrieval engine for unstructured data ☆1,217 Updated this week
- 🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite ☆861 Updated last week
- The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol. ☆1,582 Updated this week
- ☆618 Updated 3 months ago
- 🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library ☆2,737 Updated this week
- Everything about the SmolLM2 and SmolVLM family of models ☆1,995 Updated 3 weeks ago
- Late Interaction Models Training & Retrieval ☆254 Updated this week
- Lightweight Nearest Neighbors with Flexible Backends ☆258 Updated last week
- Open-source framework for creating and managing simulations populated with AI-powered agents. It provides an intuitive platform for desig… ☆903 Updated last month
- TextGrad: Automatic ''Differentiation'' via Text -- using large language models to backpropagate textual gradients. ☆2,124 Updated this week
- ☆682 Updated this week
- Build datasets using natural language ☆423 Updated last week
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples. ☆410 Updated last year
- Generate large synthetic data using an LLM ☆389 Updated this week
- Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cro… ☆766 Updated 3 months ago
- Optimizing inference proxy for LLMs ☆2,091 Updated this week
- ☆207 Updated 8 months ago