ivbeg / newsworker
Advanced news feed extractor and finder library. Helps automatically extract news from websites that lack RSS/Atom feeds.
☆80Updated 3 years ago
Alternatives and similar repositories for newsworker
Users who are interested in newsworker compare it to the libraries listed below
- ☆62Updated last year
- Find rss, atom, xml, and rdf feeds on webpages☆30Updated last year
- Extract text from HTML☆134Updated 5 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends☆57Updated last year
- Lightweight library that converts an HTML webpage to JSON data using a template defined in JSON.☆23Updated 4 months ago
- CoCrawler is a versatile web crawler built using modern tools and concurrency.☆189Updated 3 years ago
- A helper library full of URL-related heuristics.☆71Updated 2 weeks ago
- API - extract a list of keywords from a text.☆18Updated 8 years ago
- Parses Firefox/Chrome HTML bookmarks files☆49Updated last year
- Lazy helper tool that makes simple scraping tasks easier☆19Updated 2 years ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t…☆33Updated 2 years ago
- Simple framework for building Instagram chat bots with menu driven interface☆18Updated 5 years ago
- Python library to read, write and convert data files with formats BSON, JSON, NDJSON, Parquet, ORC, XLS, XLSX and XML☆16Updated 3 months ago
- Python wrapper for Ferret☆43Updated 3 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy.☆41Updated 6 years ago
- Architecture for the Twint scraper that allows downloading tweets on many instances without API restrictions☆10Updated 4 years ago
- Aiohttp web server API, which scrapes Google and returns scrape results as response. Supports proxies, multiple geos and number of result…☆59Updated last year
- Aggregates posts from the Telegram channels assigned to a bot (admin), saves them into MongoDB & renders the data in the form of cards (R…☆14Updated 2 years ago
- Your Advanced Twitter stalking tool☆151Updated last year
- Python/Django based webapps and web user interfaces for search, structure (meta data management like thesaurus, ontologies, annotations a…☆98Updated 3 years ago
- Extracts tables from .docx files and saves them as .csv or .xls files☆64Updated 2 years ago
- Ultimate Website Sitemap Parser☆225Updated last month
- The little things give you away... A collection of various small helper stuff – Mirror repo only, no longer kept in sync, refer to gitea.…☆24Updated 5 years ago
- Tag news stories based on models trained on the NYT corpus.☆42Updated 2 years ago
- Matrix-based News Aggregation to Explore Media Bias☆19Updated 7 years ago
- Python library for scraping Google search results☆115Updated 10 months ago
- A Python package that helps scrape all news details from any news website☆219Updated 4 months ago
- A framework to manage, monitor and deploy marketing in social media by re-posting content from one place to another.☆36Updated 2 years ago
- Document Search Engine Tool☆74Updated 2 years ago
- Scrapes sites. Gets news. Eventually events.☆85Updated 9 years ago