huggingface / OBELICS

Code used for the creation of OBELICS, an open, massive and curated collection of interleaved image-text web documents, containing 141M documents, 115B text tokens and 353M images.
202 · Updated 9 months ago

Alternatives and similar repositories for OBELICS

Users who are interested in OBELICS are comparing it to the repositories listed below.
