matthewwithanm / python-markdownify
Convert HTML to Markdown
☆1,302 Updated this week
Alternatives and similar repositories for python-markdownify:
Users who are interested in python-markdownify compare it to the libraries listed below.
- Markdown parser, done right. 100% CommonMark support, extensions, syntax plugins & high speed. Now in Python! ☆794 Updated this week
- A markdown parser with high extensibility. ☆374 Updated this week
- Thin wrapper for "pandoc" (MIT) ☆926 Updated this week
- A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page, designed for LLM RAG ☆348 Updated 5 months ago
- Python bindings to PDFium ☆495 Updated this week
- AI chat and search for text, news, images and videos using the DuckDuckGo.com search engine. ☆1,307 Updated this week
- Convert HTML to Markdown-formatted text. ☆1,876 Updated 6 months ago
- A Python library to access ISO country, subdivision, language, currency and script definitions and their translations. ☆803 Updated this week
- A tool to turn Markdown into a nested JSON structure. ☆307 Updated 4 months ago
- A fast, extensible and spec-compliant Markdown parser in pure Python. ☆863 Updated 3 months ago
- A simple HTML content extractor in Python. Can be run as a wrapper for Mozilla's Readability.js package or in pure-python mode. ☆256 Updated last month
- Fuzzy String Matching in Python ☆3,027 Updated 11 months ago
- Demos, examples and utilities using PyMuPDF ☆618 Updated 6 months ago
- A rate limiter for Starlette and FastAPI ☆1,334 Updated 2 weeks ago
- 🎭 Playwright integration for Scrapy ☆1,085 Updated 2 months ago
- Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XM… ☆3,864 Updated last month
- Parsing JavaScript objects into Python data structures ☆201 Updated 3 weeks ago
- Python binding to Modest and Lexbor engines (fast HTML5 parser with CSS selectors). ☆1,199 Updated 3 weeks ago
- pgvector support for Python ☆1,060 Updated last month
- A python module to repair invalid JSON, commonly used to parse the output of LLMs ☆1,409 Updated 3 weeks ago
- Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors ☆1,174 Updated this week
- Production-grade retries for Python ☆1,009 Updated this week
- Display tabular data in a visually appealing ASCII table format ☆1,420 Updated this week
- Parse and manage posts with YAML (or other) frontmatter ☆346 Updated last year
- Extract structured text from pdfs quickly ☆393 Updated this week
- RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF ☆728 Updated 2 months ago
- A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks. ☆235 Updated last week
- A request rate limiter for fastapi ☆532 Updated 9 months ago
- fastapi-cache is a tool to cache fastapi response and function result, with backends support redis and memcached. ☆1,446 Updated this week
- Rapid fuzzy string matching in Python using various string metrics ☆2,853 Updated this week