jvanasco / metadata_parser
Python library for getting metadata
☆143 Updated last week
Alternatives and similar repositories for metadata_parser:
Users who are interested in metadata_parser are comparing it to the libraries listed below.
- A Python library for extracting titles, images, descriptions and canonical urls from HTML. ☆149 Updated 4 years ago
- Simple, robust email validation ☆130 Updated 2 years ago
- Web scraping Page Objects core library ☆98 Updated last month
- A Python library for finding feed links on websites. ☆52 Updated 2 years ago
- A Python module to parse the Open Graph Protocol ☆231 Updated 3 years ago
- URLExtract is a Python class for collecting (extracting) URLs from given text based on locating TLD. ☆255 Updated last year
- Extract text from HTML ☆134 Updated 4 years ago
- A generic crawler ☆78 Updated 6 years ago
- extract difference between two html pages ☆32 Updated 6 years ago
- Extracts OpenGraph, TwitterCard and Schema properties from a webpage. ☆83 Updated 10 months ago
- Modern robots.txt Parser for Python ☆192 Updated last year
- Python package to detect and return RSS / Atom feeds for a given website. The tool supports major blogging platforms including WordPress, … ☆21 Updated 3 years ago
- A client interface for Scrapinghub's API ☆205 Updated last month
- ⛏ a library for scraping unreliable pages ☆210 Updated 7 months ago
- A lightweight customisable RSS reader for Django. ☆171 Updated 2 years ago
- Python client library for Postmark API ☆141 Updated last year
- Page Object pattern for Scrapy ☆120 Updated last month
- URL Transformation, Sanitization ☆103 Updated last year
- A Scrapy extension to log items coverage when the spider shuts down ☆19 Updated 4 years ago
- RSS Aggregator ☆91 Updated 3 years ago
- Automatic unit test generation for Scrapy. ☆56 Updated 3 years ago
- A library to read a YML file with Xpath or CSS Selectors and extract data from HTML pages using them ☆65 Updated 2 years ago
- a small library for extracting rich content from urls ☆646 Updated 3 months ago
- Have you ever wanted multiple views to match to the same URL? Now you can. ☆274 Updated last year
- Run a Scrapy spider programmatically from a script or a Celery task - no project required. ☆122 Updated 9 months ago
- URL normalization for Python ☆94 Updated 2 years ago
- Clickable label widget for django-taggit ☆69 Updated last year
- A database backed job scheduler for Django RQ and RQ Scheduler ☆42 Updated 2 years ago
- Twitter text processing library (auto linking and extraction of usernames, lists and hashtags). ☆178 Updated 3 months ago
- Scrapy middleware to add extra fields to items, like timestamp, response fields, spider attributes etc. ☆56 Updated 3 years ago