adbar / trafilatura
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
6,131 · Jun 13, 2026 · Updated this week

Alternatives and similar repositories for trafilatura

Users who are interested in trafilatura are comparing it to the libraries listed below.
