ivbeg / newsworker
Advanced news feed extractor and finder library. Helps automatically extract news from websites that have no RSS/Atom feeds.
☆80 Updated 2 years ago
Alternatives and similar repositories for newsworker
Users who are interested in newsworker compare it to the libraries listed below.
- Universal parser that converts declarations into the format used for submission to Декларатор (Declarator). ☆18 Updated 7 months ago
- Find RSS, Atom, XML, and RDF feeds on webpages ☆30 Updated 7 months ago
- Lazy helper tool that makes scraping easier for simple tasks ☆18 Updated 2 years ago
- Parses Firefox/Chrome HTML bookmarks files ☆49 Updated last year
- Aiohttp web server API, which scrapes Google and returns scrape results as response. Supports proxies, multiple geos and number of result… ☆56 Updated last year
- Project on text topics evolution over time analysis ☆81 Updated 2 years ago
- Extract text from HTML ☆135 Updated 4 years ago
- Scrape VK media ☆57 Updated last year
- Bot for forwarding updates from RSS/Atom feeds to Telegram ☆57 Updated 2 weeks ago
- Read It Later for Telegram ☆83 Updated 7 years ago
- Proxy collector ☆150 Updated 2 years ago
- CoCrawler is a versatile web crawler built using modern tools and concurrency. ☆191 Updated 3 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 Updated 5 years ago
- A helper library full of URL-related heuristics. ☆69 Updated 2 months ago
- Scrapy middleware that allows crawling only new content ☆79 Updated 2 years ago
- Quick and dirty date parsing Python library to parse HTML dates really fast ☆20 Updated last year
- Russian names parsers, gender identification and processing tools ☆129 Updated last year
- Comparing quality and performance of NLP systems for Russian language ☆49 Updated last year
- Web scraping Page Objects core library ☆101 Updated last week
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends ☆56 Updated last year
- Extracts tables from .docx files and saves them as .csv or .xls files ☆63 Updated last year
- Python wrapper for Ferret ☆41 Updated 3 years ago
- Repository for ru-syntax command line tool. ☆16 Updated 3 years ago
- Russian Text Expansion based on ruGPT3Large ☆25 Updated 3 years ago
- API - extract a list of keywords from a text. ☆18 Updated 7 years ago
- Pulls multiple podcast feeds (RSS) and republishes as a common feed, properly sorted and podcast-client friendly. ☆126 Updated last month
- Telegram bot forwarding messages to the inbox ☆140 Updated this week
- Term extraction for Russian language ☆89 Updated 6 years ago
- A professional-grade text randomizer and ad generator by Airat Halitov, perfect for creating unique, human-readable content at scale. ☆23 Updated 2 months ago
- Python client for Yandex.XML ☆19 Updated 2 years ago