dfm / feedfinder2
A Python library for finding feed links on websites.
☆52 · Updated 2 years ago
Alternatives and similar repositories for feedfinder2:
Users who are interested in feedfinder2 are comparing it to the libraries listed below.
- Paginating the web ☆37 · Updated 11 years ago
- Find which links on a web page are pagination links ☆29 · Updated 8 years ago
- Modularly extensible semantic metadata validator ☆83 · Updated 9 years ago
- URL Transformation, Sanitization ☆103 · Updated last year
- Utility library to turn country names into ISO two-letter codes ☆66 · Updated last month
- Python 3 AsyncIO powered scraping framework with batteries included ☆20 · Updated 8 years ago
- Detect and classify pagination links ☆15 · Updated 4 years ago
- Python implementation of the Parsley language for extracting structured data from web pages ☆92 · Updated 7 years ago
- A Python library for extracting titles, images, descriptions and canonical urls from HTML. ☆148 · Updated 4 years ago
- Faster replacement for Python's urlparse module ☆46 · Updated 6 years ago
- feedparser but faster and worse ☆103 · Updated 3 years ago
- A scrapy extension to store requests and responses information in storage service ☆26 · Updated 3 years ago
- 🌆 TouristFriend API lets you query Google Places, Yelp and Foursquare at the same time, with Bayesian rankings! ☆29 · Updated 6 years ago
- A component that tries to avoid downloading duplicate content ☆27 · Updated 6 years ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆148 · Updated 2 months ago
- Restrict crawl and scraping scope using matchers. ☆25 · Updated 8 years ago
- Library to populate items using XPath and CSS with a convenient API ☆47 · Updated 2 weeks ago
- Find elements in HTML by matching them with a skeleton ☆25 · Updated 2 years ago
- Perform lexical analysis on words, one word at a time. ☆64 · Updated 6 years ago
- python library for getting metadata ☆143 · Updated 2 months ago
- Analyze scraped data ☆46 · Updated 5 years ago
- The missing datasets manager. Like Homebrew but for datasets. CLI tool to search and discover datasets! ☆41 · Updated 7 years ago
- Scrapy spider middleware to clean up query parameters in request URLs ☆25 · Updated 8 years ago
- Aviation grade news article metadata extraction ☆36 · Updated last year
- A fork of http://pydispatcher.sourceforge.net/ with PyPy support ☆16 · Updated 7 years ago
- Python package to detect and return RSS / Atom feeds for a given website. The tool supports major blogging platform including Wordpress, … ☆21 · Updated 3 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 · Updated 5 years ago
- ☆18 · Updated 8 years ago
- python library for extracting html microdata ☆166 · Updated last year
- Modern robots.txt Parser for Python ☆192 · Updated last year