ContextualAI/gritlm

Generative Representational Instruction Tuning

☆697

Alternatives and similar repositories for gritlm

Users who are interested in gritlm are comparing it to the libraries listed below.

McGill-NLP / llm2vec
Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
☆1,706 · Apr 4, 2026 · Updated 3 months ago
texttron / tevatron
Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025.
☆742 · Updated this week
jakespringer / echo-embeddings
☆168 · Apr 17, 2024 · Updated 2 years ago
facebookresearch / dpr-scale
Scalable training for dense retrieval models.
☆298 · Jul 2, 2026 · Updated 3 weeks ago
xlang-ai / BRIGHT
[ICLR 2025] BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval
☆206 · Sep 13, 2025 · Updated 10 months ago
FlagOpen / FlagEmbedding
Retrieval and Retrieval-augmented LLMs
☆11,979 · Apr 22, 2026 · Updated 3 months ago
castorini / rank_llm
RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.
☆610 · Updated this week
nlp-uoregon / ullme
☆20 · Apr 8, 2025 · Updated last year
facebookresearch / ReasonIR
Official repository for the paper "ReasonIR: Training Retrievers for Reasoning Tasks".
☆230 · Jul 2, 2026 · Updated 3 weeks ago
SeanLee97 / AnglE
Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard
☆573 · Mar 22, 2026 · Updated 4 months ago
nomic-ai / contrastors
Train Models Contrastively in PyTorch
☆798 · Mar 26, 2025 · Updated last year
embeddings-benchmark / mteb
MTEB: State-of-the-art evaluation of embeddings across languages and modalities
☆3,368 · Updated this week
arcee-ai / mergekit
Tools for merging pretrained large language models.
☆7,260 · Jun 17, 2026 · Updated last month
naver / splade
SPLADE: sparse neural search (SIGIR21, SIGIR22)
☆999 · May 3, 2024 · Updated 2 years ago
AIR-Bench / AIR-Bench
[ACL 2025] AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark
☆167 · Mar 29, 2026 · Updated 3 months ago
huggingface / alignment-handbook
Robust recipes to align language models with human and AI preferences
☆5,643 · May 26, 2026 · Updated last month
Muennighoff / sgpt
SGPT: GPT Sentence Embeddings for Semantic Search
☆872 · Feb 17, 2024 · Updated 2 years ago
stanfordnlp / pyreft
Stanford NLP Python library for Representation Finetuning (ReFT)
☆1,574 · Mar 5, 2026 · Updated 4 months ago
TIGER-AI-Lab / StructLM
Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)
☆76 · Oct 19, 2024 · Updated last year
staoxiao / RetroMAE
Codebase for RetroMAE and beyond.
☆275 · Jun 7, 2024 · Updated 2 years ago
HITsz-TMG / KaLM-Embedding
Code for KaLM-Embedding models
☆116 · Jun 30, 2025 · Updated last year
kongds / E5-V
E5-V: Universal Embeddings with Multimodal Large Language Models
☆275 · Dec 10, 2025 · Updated 7 months ago
beir-cellar / beir
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
☆2,252 · Oct 16, 2025 · Updated 9 months ago
luyug / GradCache
Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint
☆443 · Mar 26, 2024 · Updated 2 years ago
uclaml / SPIN
The official implementation of Self-Play Fine-Tuning (SPIN)
☆1,247 · May 8, 2024 · Updated 2 years ago
castorini / pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
☆2,102 · Jul 16, 2026 · Updated last week
AnswerDotAI / RAGatouille
Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-…
☆3,943 · May 17, 2025 · Updated last year
ielab / llm-rankers
Document Ranking with Large Language Models.
☆210 · Feb 14, 2026 · Updated 5 months ago
trapoom555 / Language-Model-STS-CFT
Improving Text Embedding of Language Models Using Contrastive Fine-tuning
☆64 · Aug 2, 2024 · Updated last year
dwzhu-pku / LongEmbed
LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024)
☆148 · Nov 9, 2024 · Updated last year
AkariAsai / self-rag
This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai,…
☆2,410 · May 25, 2024 · Updated 2 years ago
stanford-futuredata / ColBERT
ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
☆3,903 · Oct 14, 2025 · Updated 9 months ago
RulinShao / retrieval-scaling
Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".
☆226 · Dec 16, 2025 · Updated 7 months ago
huggingface / datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
☆3,220 · Updated this week
xlang-ai / instructor-embedding
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
☆2,024 · Jan 15, 2025 · Updated last year
allenai / open-instruct
AllenAI's post-training codebase
☆3,808 · Updated this week
AnswerDotAI / rerankers
A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.
☆1,626 · Dec 20, 2025 · Updated 7 months ago
ContextualAI / HALOs
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
☆908 · Sep 30, 2025 · Updated 9 months ago
XueFuzhao / OpenMoE
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
☆1,691 · Mar 8, 2024 · Updated 2 years ago