medialab / ural
A helper library full of URL-related heuristics.
☆69 Updated 2 weeks ago
Alternatives and similar repositories for ural
Users who are interested in ural compare it to the libraries listed below.
- Now included in rigour ☆151 Updated last month
- Extract networks of entities from journalistic reporting ☆48 Updated last year
- Web scraping Page Objects core library ☆101 Updated 2 weeks ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 Updated 5 years ago
- ETL pipeline, graphical explorer and general toolbox for investigations with follow the money data ☆23 Updated last year
- Ingestors extract the contents of mixed unstructured documents into structured (followthemoney) data. ☆64 Updated last week
- Specialized & performant CSV readers, writers and enrichers for Python. ☆13 Updated last year
- A Python library for defining rule-based overrides on messy data ☆16 Updated 2 months ago
- Extract text from HTML ☆134 Updated 4 years ago
- A Python scraping module that extracts text from articles found in RSS feeds. Uses SQLite as its database. ☆19 Updated 11 months ago
- Fast and robust date extraction from web pages, with Python or on the command-line ☆130 Updated 5 months ago
- API client for Aleph, supports bulk entity and document upload. ☆28 Updated 8 months ago
- Extract city and country mentions from text, like GeoText, but using FlashText (an Aho-Corasick implementation) instead of regex. ☆62 Updated this week
- Alternative robots parser module for Python ☆18 Updated this week
- how hard is it to get a list of all local news sites in the United States (LOL) ☆8 Updated 5 years ago
- Web interface for network analysis. ☆21 Updated 2 years ago
- Utility library to turn country names into ISO two-letter codes ☆69 Updated last week
- A maximum-strength name parser for record linkage. ☆37 Updated last week
- Scraper for Facebook, Gab, Google and TikTok ☆21 Updated this week
- A webmining CLI tool & library for Python. ☆326 Updated last week
- Python IMage MIning ☆14 Updated 3 months ago
- A base library for building web scrapers for statistical data, and a helper ontology for (primarily Swedish) statistical data. ☆13 Updated 3 months ago
- Python based Wikidata framework for easy dataframe extraction ☆44 Updated last year
- Trying to generate name synonyms from wikidata ☆32 Updated 4 years ago
- An alpha project combining beneficial ownership and contracting data ☆13 Updated 4 years ago
- How can we improve name matching in screening tools? ☆13 Updated 4 months ago
- Extract dates from text ☆64 Updated 4 years ago
- Page Object pattern for Scrapy ☆123 Updated 3 weeks ago
- The most advanced debugging and testing tool for Scrapy ☆16 Updated 2 years ago
- Inspect a URL and estimate if it contains a news story ☆39 Updated 6 months ago