McGill-NLP/llm2vec

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

☆1,706

Alternatives and similar repositories for llm2vec

Users that are interested in llm2vec are comparing it to the libraries listed below.

ContextualAI / gritlm
Generative Representational Instruction Tuning
☆697 · Jun 25, 2025 · Updated last year
jakespringer / echo-embeddings
☆168 · Apr 17, 2024 · Updated 2 years ago
AnswerDotAI / ModernBERT
Bringing BERT into modernity via both architecture changes and scaling
☆1,703 · Mar 1, 2026 · Updated 4 months ago
FlagOpen / FlagEmbedding
Retrieval and Retrieval-augmented LLMs
☆11,979 · Apr 22, 2026 · Updated 3 months ago
embeddings-benchmark / mteb
MTEB: State-of-the-art evaluation of embeddings across languages and modalities
☆3,368 · Updated this week
arcee-ai / mergekit
Tools for merging pretrained large language models.
☆7,260 · Jun 17, 2026 · Updated last month
nomic-ai / contrastors
Train Models Contrastively in Pytorch
☆798 · Mar 26, 2025 · Updated last year
LeeSureman / E5-Retrieval-Reproduction
Use contrastive learning to train a large language model (LLM) as a retriever
☆12 · Jul 19, 2024 · Updated 2 years ago
AnswerDotAI / RAGatouille
Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-…
☆3,943 · May 17, 2025 · Updated last year
microsoft / LLM2CLIP
LLM2CLIP significantly improves already state-of-the-art CLIP models.
☆679 · Feb 1, 2026 · Updated 5 months ago
stanford-futuredata / ColBERT
ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
☆3,903 · Oct 14, 2025 · Updated 9 months ago
AnswerDotAI / rerankers
A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.
☆1,626 · Dec 20, 2025 · Updated 7 months ago
castorini / rank_llm
RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.
☆610 · Updated this week
stanfordnlp / pyreft
Stanford NLP Python library for Representation Finetuning (ReFT)
☆1,574 · Mar 5, 2026 · Updated 4 months ago
huggingface / trl
Train transformer language models with reinforcement learning.
☆18,920 · Updated this week
argilla-io / distilabel
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi…
☆3,344 · Updated this week
huggingface / datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
☆3,220 · Updated this week
lightonai / pylate
Late Interaction Models Training & Retrieval
☆876 · Updated this week
MinishLab / model2vec
Fast State-of-the-Art Static Embeddings
☆2,166 · Jun 6, 2026 · Updated last month
huggingface / sentence-transformers
State-of-the-Art Embeddings, Retrieval, and Reranking
☆18,941 · Updated this week
meta-pytorch / torchtune
PyTorch native post-training library
☆5,785 · Updated this week
princeton-nlp / SimCSE
[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
☆3,655 · Oct 16, 2024 · Updated last year
microsoft / unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
☆22,170 · Jan 23, 2026 · Updated 6 months ago
xlang-ai / instructor-embedding
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
☆2,024 · Jan 15, 2025 · Updated last year
axolotl-ai-cloud / axolotl
Go ahead and axolotl questions
☆12,242 · Updated this week
huggingface / setfit
Efficient few-shot learning with Sentence Transformers
☆2,777 · May 26, 2026 · Updated last month
WhereIsAI / BiLLM
Tool for converting LLMs from uni-directional to bi-directional by removing causal mask for tasks like classification and sentence embedd…
☆67 · Dec 12, 2024 · Updated last year
jxmorris12 / cde
code for training & evaluating Contextual Document Embedding models
☆207 · May 14, 2025 · Updated last year
castorini / pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
☆2,102 · Jul 16, 2026 · Updated last week
huggingface / peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
☆21,446 · Updated this week
kongds / E5-V
E5-V: Universal Embeddings with Multimodal Large Language Models
☆275 · Dec 10, 2025 · Updated 7 months ago
xhluca / bm25s
Fast BM25 search in Python, powered by Numpy and Numba
☆1,746 · Updated this week
huggingface / alignment-handbook
Robust recipes to align language models with human and AI preferences
☆5,643 · May 26, 2026 · Updated last month
SeanLee97 / AnglE
Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard
☆573 · Mar 22, 2026 · Updated 4 months ago
TIGER-AI-Lab / VLM2Vec
This repo contains the code for "VLM2Vec / MMEB" [ICLR 2025], "VLM2Vec-V2 / MMEB-V2" [TMLR 2026], and "MMEB-V3" [COLM 2026]
☆668 · Updated this week
dottxt-ai / outlines
Structured Outputs
☆15,308 · Updated this week
vec2text / vec2text
utilities for decoding deep representations (like sentence embeddings) back to text
☆1,129 · Dec 27, 2025 · Updated 6 months ago
texttron / tevatron
Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025.
☆742 · Updated this week
HITsz-TMG / KaLM-Embedding
Code for KaLM-Embedding models
☆116 · Jun 30, 2025 · Updated last year