chuanconggao / html2json
Lightweight library that converts an HTML webpage to JSON data using a template defined in JSON.
☆22 Updated this week
Alternatives and similar repositories for html2json:
Users who are interested in html2json are comparing it to the libraries listed below.
- Analyze scraped data ☆46 Updated 5 years ago
- https://mimesniff.spec.whatwg.org/ implementation for Python ☆13 Updated last year
- Asyncio web crawling framework. Work in progress. ☆18 Updated 8 months ago
- Flask App - Argon Design System | AppSeed ☆11 Updated 4 years ago
- Python library for modern thread / multiprocessing pooling and task processing via asyncio ☆15 Updated 4 years ago
- Python library for finding phone numbers in random user input text. ☆8 Updated 7 years ago
- Python module for Named Entity Recognition (NER) using natural language processing. ☆13 Updated 3 years ago
- Scrapy middleware that allows crawling only new content ☆80 Updated 2 years ago
- A fork of http://pydispatcher.sourceforge.net/ with PyPy support ☆16 Updated 7 years ago
- Assorted generic Flask views, blueprints, Jinja2 filters, macros, forms and more. ☆24 Updated 5 years ago
- Extract the differences between two HTML pages ☆32 Updated 6 years ago
- A component that tries to avoid downloading duplicate content ☆27 Updated 6 years ago
- ☕🗄CAching Proxy in Python – Simple file-based Python HTTP proxy ☆15 Updated 3 years ago
- Find which links on a web page are pagination links ☆29 Updated 8 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 Updated 5 years ago
- AnyAPI is a library that helps you write any API wrapper with ease and in a Pythonic way. ☆132 Updated 3 years ago
- Generate random date(time) in Python. ☆10 Updated last year
- Python package to detect and return RSS / Atom feeds for a given website. The tool supports major blogging platforms including Wordpress, … ☆21 Updated 3 years ago
- 🏃♀️ Minimalistic CLI Tool for Managing and Running Bash Snippets ☆37 Updated 5 years ago
- Scrapy spider middleware to clean up query parameters in request URLs ☆24 Updated 8 years ago
- A Scrapy extension to store request and response information in a storage service ☆26 Updated 3 years ago
- Python 3 library for reading and writing WARC files ☆20 Updated 7 years ago
- Scraper for categories and lists on ecommerce and other listing websites ☆42 Updated 4 years ago
- A Flask-powered Kanban app ☆10 Updated 4 years ago
- A Python Instagram scraper which uses BeautifulSoup and JSON to scrape public Instagram accounts ☆27 Updated 7 years ago
- Tools to easily generate an RSS feed that contains each scraped item using the Scrapy framework. ☆33 Updated 5 months ago
- Scrapy entrypoint for Scrapinghub job runner ☆26 Updated last month
- CoCrawler is a versatile web crawler built using modern tools and concurrency. ☆190 Updated 3 years ago
- Pinterest API for Python ☆33 Updated 7 years ago
- Web scraping Page Objects core library ☆99 Updated 2 months ago