ivbeg / newsworker
Advanced news feeds extractor and finder library. Helps to automatically extract news from websites without RSS/ATOM feeds
☆80 Updated 2 years ago
Alternatives and similar repositories for newsworker
Users who are interested in newsworker are comparing it to the libraries listed below.
- Aiohttp web server API, which scrapes Google and returns scrape results as response. Supports proxies, multiple geos and number of result… ☆57 Updated last year
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends ☆57 Updated last year
- Extract text from HTML ☆134 Updated 5 years ago
- Parses Firefox/Chrome HTML bookmarks files ☆50 Updated last year
- ☆62 Updated last year
- API - extract a list of keywords from a text. ☆18 Updated 8 years ago
- CoCrawler is a versatile web crawler built using modern tools and concurrency. ☆191 Updated 3 years ago
- Lightweight library that converts an HTML webpage to JSON data using a template defined in JSON. ☆23 Updated 2 months ago
- Document Search Engine Tool ☆73 Updated 2 years ago
- Firefox and Chrome compatible extension that acts as an annotation tool for websites (Named Entity Recognition) ☆10 Updated 6 years ago
- Python client for Yandex.XML ☆19 Updated 2 years ago
- SEO Python scraper to extract data from major search engine result pages. Extract data like url, title, snippet, richsnippet and the type … ☆266 Updated 3 years ago
- Tools to easily generate an RSS feed that contains each scraped item using the Scrapy framework. ☆33 Updated last month
- Console program to get global ranking for a given website or domain ☆21 Updated 2 months ago
- DuckDuckGo search engine API library for Python ☆41 Updated 5 years ago
- Lazy helper tool that makes simple scraping tasks easier ☆19 Updated 2 years ago
- This repository provides usage examples for the Python module Newspaper3k. ☆147 Updated last year
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆33 Updated 2 years ago
- Find rss, atom, xml, and rdf feeds on webpages ☆30 Updated 9 months ago
- Poetry tools and Russian text parser ☆8 Updated 8 years ago
- Russian name parsers, gender identification and processing tools ☆131 Updated last year
- Tag news stories based on models trained on the NYT corpus. ☆42 Updated 2 years ago
- ☆63 Updated last year
- A helper library full of URL-related heuristics. ☆70 Updated 2 months ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 Updated 5 years ago
- Python library for scraping Google search results ☆115 Updated 8 months ago
- A curated list of awesome Twitter tools ☆224 Updated last year
- A simple Python wrapper for Yandex's Tomita Parser (no longer needed, since Yandex has open-sourced it) ☆17 Updated 9 years ago
- Read It Later for Telegram ☆83 Updated 7 years ago
- Crawl sites for RSS, Atom, and JSON feeds. ☆77 Updated last year