google-research-datasets / common-crawl-domain-names
Corpus of domain names scraped from Common Crawl and manually annotated to add word boundaries (e.g. "commoncrawl" to "common crawl").
20 | Jun 16, 2025 | Updated 10 months ago

Alternatives and similar repositories for common-crawl-domain-names

Users who are interested in common-crawl-domain-names are comparing it to the libraries listed below.
