google-research-datasets / swim-ir

SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 languages, generated using PaLM 2 and summarize-then-ask prompting.

Alternatives and similar repositories for swim-ir:

Users interested in swim-ir are comparing it to the repositories listed below.