Alibaba-NLP/Multi-CPR

[SIGIR 2022] Multi-CPR: A Multi Domain Chinese Dataset for Passage Retrieval

☆204

Alternatives and similar repositories for Multi-CPR

Users who are interested in Multi-CPR are comparing it to the repositories listed below.

THUIR / T2Ranking
T2Ranking: A large-scale Chinese benchmark for passage ranking.
☆161 · Jul 3, 2023 · Updated 3 years ago
bugensui / WenTianSearch
"Alibaba Lingjie" Wentian Engine E-commerce Search Algorithm Competition, ranked 13/2771
☆10 · Jul 31, 2022 · Updated 3 years ago
zwkkk / wentian-rank2
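Second place in the "Alibaba Lingjie" Wentian Engine E-commerce Search Algorithm Competition. A two-stage text matching algorithm for the e-commerce domain.
☆56 · Jul 28, 2022 · Updated 3 years ago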
Alibaba-NLP / HLATR
Hybrid List Aware Transformer Reranking
☆19 · Oct 25, 2022 · Updated 3 years ago
muyuuuu / E-commerce-Search-Recall
Unofficial baseline for the Tianchi "Alibaba Lingjie" Wentian Engine E-commerce Search Algorithm Competition, also known as "NLP from beginner to 22/2771".
☆93 · Jun 29, 2022 · Updated 4 years ago
PaddlePaddle / RocketQA
🚀 RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models…
☆785 · Dec 19, 2023 · Updated 2 years ago
AlibabaResearch / HLATR
Implementation of paper: HLATR: Enhance Multi-stage Text Retrieval with Hybrid List Aware Transformer Reranking
☆74 · Jan 4, 2023 · Updated 3 years ago
bojone / SimCSE
Simple experiments with SimCSE on Chinese tasks
☆605 · Aug 7, 2023 · Updated 2 years ago
yhao-wang / LLM-Knowledge-Boundary
Implementation of "Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation"
☆21 · Jul 31, 2023 · Updated 2 years ago
CLUEbenchmark / QBQTC
QBQTC: a large-scale search matching dataset
☆86 · Dec 12, 2021 · Updated 4 years ago
luyug / Condenser
EMNLP 2021 - Pre-training architectures for dense retrieval
☆256 · Mar 18, 2022 · Updated 4 years ago
princeton-nlp / SimCSE
[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
☆3,655 · Oct 16, 2024 · Updated last year
OpenMatch / ANCE-Tele
Code and data of the EMNLP 2022 Main Conference paper "Reduce Catastrophic Forgetting of Dense Retrieval Training with Teleportation Nega…
☆18 · Mar 25, 2024 · Updated 2 years ago
microsoft / SEED-Encoder
☆45 · Oct 14, 2021 · Updated 4 years ago
facebookresearch / DPR
Dense Passage Retriever - is a set of tools and models for open domain Q&A task.
☆1,868 · Apr 6, 2023 · Updated 3 years ago
CLUEbenchmark / SimCLUE
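A dataset of 3,000,000+ examples for semantic understanding and matching. Can be used for unsupervised contrastive learning, semi-supervised learning, and similar methods to build the best-performing pre-trained models for Chinese.
☆313 · Oct 11, 2022 · Updated 3 years ago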
texttron / tevatron
Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025.
☆742 · Updated this week
baidu / DuReader
Baseline Systems of DuReader Dataset
☆1,178 · May 26, 2022 · Updated 4 years ago
FreedomIntelligence / DPTDR
Code for COLING22 paper, DPTDR: Deep Prompt Tuning for Dense Passage Retrieval
☆26 · Aug 7, 2023 · Updated 2 years ago
JetRunner / LaPraDoR
🦮 Code and pretrained models for Findings of ACL 2022 paper "LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrie…
☆49 · Apr 25, 2022 · Updated 4 years ago
sebastian-hofstaetter / tas-balanced-dense-retrieval
SIGIR 2021: Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling
☆60 · Jul 11, 2021 · Updated 5 years ago
dropreg / R-Drop
☆880 · May 24, 2024 · Updated 2 years ago
Macielyoung / sentence_representation_matching
Sentence matching models, including the unsupervised SimCSE, ESimCSE, and PromptBERT, and the supervised SBERT and CoSENT.
☆98 · Oct 29, 2022 · Updated 3 years ago
SmartLi8 / stella
text embedding
☆145 · Sep 18, 2023 · Updated 2 years ago
pluto-junzeng / ChineseSquad
Chinese machine reading comprehension dataset
☆108 · Mar 29, 2021 · Updated 5 years ago
wangyuxinwhy / uniem
unified embedding model
☆876 · Sep 1, 2023 · Updated 2 years ago
xinyi-code / SimCSE-Pytorch
Implementation of SimCSE + ESimCSE on Chinese datasets
☆190 · May 21, 2022 · Updated 4 years ago
RUCAIBox / DenseRetrieval
☆220 · Dec 7, 2022 · Updated 3 years ago
drogozhang / LED
Source code of paper 'LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval' (WWW 2023)
☆22 · Aug 28, 2023 · Updated 2 years ago
Sanster / global_pointer
☆13 · Jun 20, 2022 · Updated 4 years ago
microsoft / MSMARCO-Passage-Ranking
MS MARCO (Microsoft Machine Reading Comprehension) is a large scale dataset focused on machine reading comprehension, question answering, …
☆343 · Jun 12, 2023 · Updated 3 years ago
unicamp-dl / mMARCO
A multilingual version of MS MARCO passage ranking dataset
☆148 · Oct 19, 2023 · Updated 2 years ago
zejunwang1 / bert4vec
A sentence embedding generation tool based on pre-trained models
☆138 · Mar 30, 2023 · Updated 3 years ago
jingtaozhan / disentangled-retriever
An easy-to-use python toolkit for flexibly adapting various neural ranking models to target domain.
☆60 · May 17, 2023 · Updated 3 years ago
RUC-GSAI / YuLan-IR
YuLan-IR: Information Retrieval Boosted LMs
☆220 · Mar 4, 2024 · Updated 2 years ago
beir-cellar / beir
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
☆2,243 · Oct 16, 2025 · Updated 9 months ago
AdeDZY / DeepCT
DeepCT and HDCT uses BERT to generate novel, context-aware bag-of-words term weights for documents and queries.
☆325 · May 9, 2021 · Updated 5 years ago
ielab / asyncval
A toolkit for asynchronously validating dense retriever checkpoints during training.
☆27 · Aug 10, 2023 · Updated 2 years ago
benywon / ReCO
ReCO: A Large Scale Chinese Reading Comprehension Dataset on Opinion
☆37 · Jul 25, 2024 · Updated last year