google-research-datasets / common-crawl-domain-names
Corpus of domain names scraped from Common Crawl and manually annotated to add word boundaries (e.g. "commoncrawl" to "common crawl").
20 · Jun 16, 2025 · Updated 9 months ago

Alternatives and similar repositories for common-crawl-domain-names

Users who are interested in common-crawl-domain-names are comparing it to the libraries listed below.
