bitextor / warc2text

Extracts plain text, language identification, and other metadata from WARC records

Related projects: