MinishLab/model2vec

MinishLab / model2vec

Fast State-of-the-Art Static Embeddings

☆2,158

Alternatives and similar repositories for model2vec

Users who are interested in model2vec often compare it to the libraries listed below.

MinishLab / tokenlearn
Pre-train Static Word Embeddings
☆108 · Jun 9, 2026 · Updated last month
MinishLab / vicinity
Lightweight Nearest Neighbors with Flexible Backends
☆347 · May 24, 2026 · Updated last month
MinishLab / semhash
Fast Multimodal Semantic Deduplication & Filtering
☆945 · May 24, 2026 · Updated last month
lightonai / pylate
Late Interaction Models Training & Retrieval
☆875 · Jul 13, 2026 · Updated last week
AnswerDotAI / rerankers
A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.
☆1,624 · Dec 20, 2025 · Updated 7 months ago
xhluca / bm25s
Fast BM25 search in Python, powered by Numpy and Numba
☆1,738 · Jul 7, 2026 · Updated last week
dleemiller / WordLlama
Things you can do with the token embeddings of an LLM
☆1,450 · Dec 1, 2025 · Updated 7 months ago
MinishLab / model2vec-rs
Official Rust Implementation of Model2Vec
☆197 · May 24, 2026 · Updated last month
dottxt-ai / outlines
Structured Outputs
☆14,547 · Updated this week
AnswerDotAI / RAGatouille
Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-…
☆3,939 · May 17, 2025 · Updated last year
urchade / GLiNER
Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts)
☆3,403 · Updated this week
koaning / embetter
just a bunch of useful embeddings for scikit-learn pipelines
☆527 · Feb 12, 2026 · Updated 5 months ago
qdrant / fastembed
Fast, Accurate, Lightweight Python library to make State of the Art Embedding
☆3,092 · Updated this week
huggingface / setfit
Efficient few-shot learning with Sentence Transformers
☆2,772 · May 26, 2026 · Updated last month
predibase / lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
☆3,816 · May 28, 2026 · Updated last month
neuml / staticvectors
🔢 Work with static vector models
☆39 · Apr 21, 2025 · Updated last year
stephantul / pynife
Nearly Inference Free Embeddings: make your RAG queries 500x faster
☆80 · Apr 27, 2026 · Updated 2 months ago
argilla-io / distilabel
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi…
☆3,332 · Jul 13, 2026 · Updated last week
stanfordnlp / dspy
DSPy: The framework for programming—not prompting—language models
☆36,221 · Updated this week
stanford-futuredata / ColBERT
ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
☆3,902 · Oct 14, 2025 · Updated 9 months ago
neuml / txtai
💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows
☆12,733 · Updated this week
michaelfeil / infinity
Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, clip, clap and colpali
☆2,888 · Mar 24, 2026 · Updated 3 months ago
Pringled / agentcheck
Check what an AI agent can access before you run it
☆27 · Mar 8, 2026 · Updated 4 months ago
AnswerDotAI / ModernBERT
Bringing BERT into modernity via both architecture changes and scaling
☆1,700 · Mar 1, 2026 · Updated 4 months ago
argilla-io / argilla
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
☆5,038 · Jul 13, 2026 · Updated last week
Lightning-AI / LitServe
A minimal Python framework for building custom AI inference servers with full control over logic, batching, and scaling.
☆3,920 · Jul 6, 2026 · Updated 2 weeks ago
davidberenstein1957 / fast-sentence-transformers
Simply, faster, sentence-transformers
☆144 · Aug 27, 2024 · Updated last year
huggingface / text-embeddings-inference
A blazing fast inference solution for text embeddings models
☆4,943 · Updated this week
transformerlab / transformerlab-app
The open source research environment for AI researchers to seamlessly train, evaluate, and scale models from local hardware to GPU cluste…
☆5,162 · Updated this week
Lightning-AI / litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
☆13,490 · Jul 6, 2026 · Updated 2 weeks ago
huggingface / datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
☆3,210 · Updated this week
mixedbread-ai / batched
The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o…
☆161 · Jul 14, 2025 · Updated last year
arcee-ai / mergekit
Tools for merging pretrained large language models.
☆7,243 · Jun 17, 2026 · Updated last month
tjmlabs / ColiVara
Colivara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st…
☆1,483 · Jul 8, 2026 · Updated last week
huggingface / smol-course
A course on aligning smol models.
☆6,677 · May 26, 2026 · Updated last month
McGill-NLP / llm2vec
Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
☆1,703 · Apr 4, 2026 · Updated 3 months ago
axolotl-ai-cloud / axolotl
Go ahead and axolotl questions
☆12,215 · Updated this week
roboflow / maestro
streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL
☆2,686 · Jul 13, 2026 · Updated last week
stephantul / skeletoken
Datamodels for hugging face tokenizers
☆108 · Jun 18, 2026 · Updated last month