huggingface / OBELICS

Code used for the creation of OBELICS, an open, massive and curated collection of interleaved image-text web documents, containing 141M documents, 115B text tokens and 353M images.
205
Updated 10 months ago

Alternatives and similar repositories for OBELICS

Users who are interested in OBELICS are comparing it to the repositories listed below.
