google-research / deduplicate-text-datasets
☆1,206 Updated 8 months ago
Alternatives and similar repositories for deduplicate-text-datasets:
Users who are interested in deduplicate-text-datasets are comparing it to the libraries listed below.
- All-in-one text de-duplication ☆671 Updated 11 months ago
- ☆1,512 Updated last week
- Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning ☆727 Updated 2 years ago
- A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets. ☆1,784 Updated 2 months ago
- Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations. ☆1,812 Updated this week
- SGPT: GPT Sentence Embeddings for Semantic Search