rcarmo / newsfeed-corpus
A Dockerized RSS feed fetcher for NLP work, using asyncio
☆20Updated 2 years ago
Alternatives and similar repositories for newsfeed-corpus
Users that are interested in newsfeed-corpus are comparing it to the libraries listed below
- A simple interface for extracting text from (almost) any URL☆52Updated 5 years ago
- Data validation as a service. Project retired; go to the current one at frictionsless/repository☆69Updated 2 years ago
- Personal Knowledge Management System. Capture your ideas using plain old text files. Make a journal that lasts 100 years.☆29Updated last year
- Tag-based bookmark manager inspired by delicious and Pinboard☆34Updated 2 years ago
- Primary LocalWiki backend server environment☆48Updated 7 years ago
- An interface for interacting with MediaWiki☆37Updated 3 years ago
- Add website scraping abilities to Datasette☆63Updated 2 years ago
- Aviation grade news article metadata extraction☆36Updated 2 years ago
- The missing datasets manager. Like Homebrew but for datasets. A CLI tool to search and discover datasets!☆40Updated 8 years ago
- remoteStorage-enabled bookmarking app☆77Updated 2 years ago
- An eBook tool that extracts the ISBN or metadata from eBooks and renames them using an ISBN database and that metadata☆30Updated 9 years ago
- An easy-to-use and highly customizable crawler that enables you to create your own little Web archives (WARC/CDX)☆25Updated 7 years ago
- unformatted text > parse/clean it > get relevant info☆52Updated 6 years ago
- Saveto. Quickly save links, collections, notes, snippets, ...☆47Updated last month
- Create and deploy a RESTful API with a few lines of YAML☆32Updated 6 years ago
- Now included in rigour☆151Updated last month
- Save data from Google Takeout to a SQLite database☆109Updated last year
- A Python utility for moving bookmarks/reading lists between services☆204Updated 9 years ago
- Presentations on Quantified Self and Self-Tracking with Python☆30Updated 2 years ago
- Script that fetches emails from Gmail and converts them to Trello cards.☆9Updated 6 years ago
- Utility library to turn country names into ISO two-letter codes☆69Updated 2 weeks ago
- A library to extract a publication date from a web page, along with a measure of the accuracy.☆41Updated 5 years ago
- ☆15Updated 6 years ago
- Data Pipes for CSV☆116Updated 2 years ago
- Bookmark and archive webpages from the command line☆33Updated 6 years ago
- Python code to scrape and collect data from the RSS feeds Facebook uses to augment its Trending Section☆57Updated 6 years ago
- Create a SQLite database containing data from your Pocket account☆105Updated last year
- A simple command line interface to the datamade/dedupe library.☆42Updated 2 years ago
- how hard is it to get a list of all local news sites in the United States (LOL)☆8Updated 5 years ago
- Inspect a URL and estimate if it contains a news story☆39Updated 7 months ago