naver/bergen

naver / bergen

Benchmarking library for RAG

☆276

Alternatives and similar repositories for bergen

Users who are interested in bergen compare it to the libraries listed below.

Marker-Inc-Korea / AutoRAG-example-korean-embedding-benchmark
AutoRAG example about benchmarking Korean embeddings.
☆46 · Oct 2, 2024 · Updated last year
ielab / llm-rankers
Document Ranking with Large Language Models.
☆210 · Feb 14, 2026 · Updated 5 months ago
mjeensung / xtr-pytorch
☆19 · May 16, 2024 · Updated 2 years ago
TusKANNy / seismic
Official repository of the Seismic library.
☆135 · Jul 6, 2026 · Updated 2 weeks ago
naver / splade
SPLADE: sparse neural search (SIGIR21, SIGIR22)
☆999 · May 3, 2024 · Updated 2 years ago
castorini / rank_llm
RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.
☆610 · Updated this week
hseb-benchmark / hseb
HSEB: Hybrid Search Engine Benchmark
☆21 · Oct 5, 2025 · Updated 9 months ago
recombee / CompresSAE
Sparse Embedding Compression for Scalable Retrieval in Recommender Systems
☆39 · Nov 21, 2025 · Updated 8 months ago
castorini / pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
☆2,102 · Jul 16, 2026 · Updated last week
thakur-nandan / sprint
SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset.
☆48 · Jul 25, 2023 · Updated 3 years ago
nlpai-lab / KURE
KURE: an embedding model developed at Korea University, specialized for Korean retrieval.
☆225 · Apr 14, 2026 · Updated 3 months ago
instructkr / reranker-simple-benchmark
Make running benchmark simple yet maintainable, again. Now only supports Korean-based cross-encoder.
☆35 · Dec 2, 2025 · Updated 7 months ago
zetaalphavector / InPars
Inquisitive Parrots for Search
☆200 · Jun 5, 2025 · Updated last year
TusKANNy / awesome-learned-sparse-retrieval
An extensive and commented list of resources on Learned Sparse Retrieval.
☆63 · Updated this week
liuqi6777 / llm4ranking
Large language models for document ranking.
☆75 · May 20, 2026 · Updated 2 months ago
lightonai / pylate
Late Interaction Models Training & Retrieval
☆876 · Updated this week
vsahil / MIMETIC-2
Official Code for MIMETIC^2
☆13 · Nov 19, 2024 · Updated last year
ssisOneTeam / Korean-Embedding-Model-Performance-Benchmark-for-Retriever
Korean Sentence Embedding Model Performance Benchmark for RAG
☆49 · Jan 27, 2025 · Updated last year
thongnt99 / learned-sparse-retrieval
Unified Learned Sparse Retrieval Framework
☆68 · May 13, 2024 · Updated 2 years ago
project-miracl / miracl
A large-scale multilingual dataset for Information Retrieval. Thorough human-annotations across 18 diverse languages.
☆211 · Jul 31, 2024 · Updated last year
stephantul / pynife
Nearly Inference Free Embeddings: make your RAG queries 500x faster
☆80 · Apr 27, 2026 · Updated 2 months ago
AIR-Bench / AIR-Bench
[ACL 2025] AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark
☆167 · Mar 29, 2026 · Updated 3 months ago
Furyton / GR-as-MVDR
[SIGIR'24] Generative Retrieval as Multi-Vector Dense Retrieval
☆36 · Oct 18, 2024 · Updated last year
allenai / ir_datasets
Provides a common interface to many IR ranking datasets.
☆390 · May 28, 2026 · Updated last month
castorini / ragnarok
Retrieval-Augmented Generation battle!
☆66 · Apr 18, 2026 · Updated 3 months ago
google-research-datasets / swim-ir
SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…
☆50 · Nov 13, 2023 · Updated 2 years ago
terrierteam / pyterrier_adaptive
☆18 · Jun 16, 2026 · Updated last month
texttron / tevatron
Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025.
☆743 · Jul 18, 2026 · Updated last week
facebookresearch / mexma
MEXMA: Token-level objectives improve sentence representations
☆43 · Jan 6, 2025 · Updated last year
daekeun-ml / evaluate-llm-on-korean-dataset
Performs benchmarking on two Korean datasets with minimal time and effort.
☆45 · Jan 22, 2026 · Updated 6 months ago
gangiswag / llm-reranker
☆63 · Jan 26, 2025 · Updated last year
HansiZeng / PAG
[SIGIR 2024] The official repo for paper "Planning Ahead in Generative Retrieval: Guiding Autoregressive Generation through Simultaneous …
☆32 · Apr 24, 2024 · Updated 2 years ago
ielab / Starbucks
Starbucks: Improved Training for 2D Matryoshka Embeddings
☆25 · Jun 30, 2025 · Updated last year
hltcoe / ColBERT-X
CLIR version of ColBERT
☆73 · May 28, 2026 · Updated last month
DunZhang / Jasper-Token-Compression-Training
The training codes of Jasper-Token-Compression-600M
☆20 · Nov 19, 2025 · Updated 8 months ago
DSBA-Lab / Contrastive-Accumulation
☆14 · Jul 7, 2024 · Updated 2 years ago
evgeniiaraz / datasets_multiling_dialogue
Multilingual Dialogue Datasets
☆19 · Aug 18, 2022 · Updated 3 years ago
EMNLP-2024-CritiCS / Collective-Critics-for-Creative-Story-Generation
☆14 · Jan 10, 2025 · Updated last year
beir-cellar / beir
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
☆2,252 · Oct 16, 2025 · Updated 9 months ago