Alibaba-NLP / Multi-CPR
[SIGIR 2022] Multi-CPR: A Multi Domain Chinese Dataset for Passage Retrieval
☆169 Updated last year
Related projects
Alternatives and complementary repositories for Multi-CPR
- T2Ranking: A large-scale Chinese benchmark for passage ranking. ☆151 Updated last year
- CoSENT, STS, SentenceBERT ☆162 Updated last year
- Text embedding ☆139 Updated last year
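- A collection of the extractive MRC datasets available so far for Chinese ☆119 Updated 5 months ago
- 3,000,000+ semantic understanding and matching dataset. Can be used for unsupervised contrastive learning, semi-supervised learning, and similar methods to build the best-performing pretrained models for Chinese ☆280 Updated 2 years ago
- The real “Deep Learning for Humans” ☆140 Updated 2 years ago
- Chinese instruction tuning datasets ☆118 Updated 7 months ago
- An implementation of SimCSE + ESimCSE on Chinese datasets ☆189 Updated 2 years ago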
- A framework for cleaning Chinese dialog data ☆261 Updated 3 years ago
- Simple experiments with the P-tuning method on Chinese ☆138 Updated 3 years ago
- Reproduction of supervised and unsupervised SimCSE experiments ☆145 Updated 8 months ago
- NLU & NLG (zero-shot) based on the mengzi-t5-base-mt pretrained model ☆75 Updated 2 years ago
- Simple experiments with Pattern-Exploiting Training on Chinese ☆170 Updated 4 years ago
- Experiments with several semantic matching models and a comparison of their results. ☆159 Updated last year
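- NLP sentence encoding, sentence embedding, and semantic similarity: BERT_avg, BERT_whitening, SBERT, SimCSE ☆173 Updated 2 years ago
- An upgraded version of RoFormer ☆149 Updated 2 years ago
- Sentence matching models, including unsupervised SimCSE, ESimCSE, and PromptBERT, and supervised SBERT and CoSENT. ☆97 Updated 2 years ago
- Chinese NLP datasets ☆151 Updated 5 years ago
- Chinese BigBird pretrained model ☆89 Updated 2 years ago
- An upgraded version of SimBERT (SimBERTv2)! ☆438 Updated 2 years ago
- Chinese natural language inference dataset (a large-scale Chinese natural language inference and semantic similarity calculation dataset) ☆425 Updated 4 years ago
- A pretraining-based tool for generating sentence vectors ☆132 Updated last year
- MD5 links for a Chinese book corpus ☆211 Updated 9 months ago
- Baichuan-13B instruction fine-tuning ☆89 Updated last year
- OCNLI: Original Chinese Natural Language Inference task ☆146 Updated 3 years ago
- A reproduction of SimCSE on Chinese, supervised + unsupervised ☆266 Updated 2 years ago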
- Chinese version of Longformer ☆111 Updated 4 years ago
- A simple framework for building some basic NLP tasks ☆59 Updated 2 years ago
- Chinese machine reading comprehension datasets ☆100 Updated 3 years ago
- ☆277 Updated 2 years ago