gambolputty / newscorpus
A Python scraping module that extracts text from articles found in RSS feeds. Uses SQLite as its database.
☆19 · Updated 10 months ago
Alternatives and similar repositories for newscorpus:
Users who are interested in newscorpus are comparing it to the libraries listed below.
- Extract networks of entities from journalistic reporting ☆48 · Updated last year
- ETL pipeline, graphical explorer and general toolbox for investigations with Follow the Money data ☆23 · Updated last year
- 📜 Dehyphenation of broken text (mainly German), e.g., extracted from a PDF ☆38 · Updated 3 years ago
- Named-Entity Recognition extension for OpenRefine ☆28 · Updated 2 years ago
- Adds a reconciliation API endpoint to Datasette, based on the Reconciliation Service API specification. ☆24 · Updated last year
- A deep learning model for extracting references from text ☆28 · Updated last year
- A collaborative collection of datasets commonly used in "Follow the Money" investigations with a European scope ☆13 · Updated 11 months ago
- Python package to reconcile DataFrames ☆24 · Updated 2 years ago
- A deep learning architecture for reference mining from literature in the arts and humanities. ☆16 · Updated 5 years ago
- OpenRefine Reconciliation Framework in Python and Flask ☆21 · Updated 2 years ago
- A Mashup Interface for Text Analysis Operations ☆13 · Updated 4 months ago
- Next-generation Punkt sentence boundary detection with zero dependencies ☆16 · Updated last month
- Process, enhance and evaluate multiple OCR outputs. ☆22 · Updated 6 months ago
- VIAF via Python ☆12 · Updated last year
- Trying to generate name synonyms from Wikidata ☆32 · Updated 4 years ago
- Heritage Connector: Transforming text into data to extract meaning and make connections ☆24 · Updated 2 years ago
- Neo4j-powered web application for multimedia collections: bring graph-based exploration and crowd-based indexation. ☆39 · Updated 5 years ago
- Collection de romans français du dix-huitième siècle (1751-1800) / Collection of Eighteenth-Century French Novels (1751-1800) ☆22 · Updated last year
- Adding links to full text in Wikipedia references ☆37 · Updated last year
- DBpedia, which frequently crawls and analyses over 120 Wikipedia language editions, has near-complete information about (1) which facts ar… ☆11 · Updated 2 years ago
- OpenRefine reconciler for Research Organization Registry ☆13 · Updated last month
- 🔎 Finds fuzzy matches between datasets ☆12 · Updated 3 months ago
- A LevelDB-backed URL unshortening microservice written in JavaScript ☆31 · Updated 2 years ago
- A reconciliation service for OpenRefine serving data from a given CSV file. ☆77 · Updated 2 months ago
- Web interface for network analysis. ☆21 · Updated 2 years ago
- WordWanderer: take your text for a walk ☆12 · Updated 5 years ago
- Citation classification using a hybrid neural network model for Wikipedia references ☆28 · Updated 2 years ago