carlosplanchon / betterhtmlchunking
BetterHTMLChunking is a Python library for intelligent HTML segmentation. It builds a DOM tree from raw HTML and extracts content-rich regions of interest, which simplifies downstream content analysis. It is well suited to LLM-based processing.
56 | Mar 7, 2026 | Updated last month

Alternatives and similar repositories for betterhtmlchunking

Users that are interested in betterhtmlchunking are comparing it to the libraries listed below.
