Riamse / ceterach
An interface for interacting with MediaWiki
☆37 Updated 3 years ago
Alternatives and similar repositories for ceterach
Users who are interested in ceterach are comparing it to the libraries listed below
- A set of utilities for accessing and processing MediaWiki data. ☆55 Updated 6 years ago
- Semanticizest: dump parser and client ☆20 Updated 9 years ago
- An intelligent reading agent that understands text and translates it into Wikidata statements. ☆116 Updated 8 years ago
- A PDF classifier ensemble with REST API service ☆23 Updated 4 years ago
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system. ☆46 Updated 7 years ago
- Wikipedia citation tool for Google Books, New York Times, ISBN, DOI and more ☆22 Updated 8 years ago
- Pipeline for distributed Natural Language Processing, made in Python ☆65 Updated 8 years ago
- CLI tool for importing entities from Wikidata / Wikibase ☆23 Updated 2 years ago
- Code for aggregating Wikipedia traffic statistics ☆36 Updated 12 years ago
- Aviation grade news article metadata extraction ☆36 Updated 2 years ago
- Scripts and microservice to feed an Elasticsearch instance with Wikidata and Inventaire entities, and keep those up-to-date ☆41 Updated 4 years ago
- Command-line tool to extract a ranked list of relevant keywords from a corpus with the option of using either topic modeling or tf-idf sc… ☆40 Updated 8 years ago
- GitHub mirror of "analytics/quarry/web" - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_acce… ☆43 Updated 2 years ago
- A gauge widget to display Wikipedia activity ☆41 Updated 7 years ago
- Exploring power and influence in the European Union by combining information from a variety of official EU data sources related to lobbyi… ☆37 Updated 9 years ago
- A JavaScript tool to visualize diffs in Wikipedia ☆35 Updated 2 years ago
- This is a module to extract RDF from an HTML5 page annotated with microdata. The module implements the algorithm defined and published by th… ☆44 Updated 3 years ago
- Ingestors extract the contents of mixed unstructured documents into structured (followthemoney) data. ☆65 Updated last week
- Serapis is a sentence identifier and modeling pipeline / built for Wordnik ☆24 Updated 9 years ago
- Common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text ☆35 Updated 8 years ago
- Python bindings to the Dutch NLP tool Frog (POS tagger, lemmatiser, NER tagger, morphological analysis, shallow parser, dependency parser… ☆49 Updated 3 months ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 Updated 5 years ago
- Automatic tagging and analysis of documents in an Apache Solr index for faceted search by RDF(S) Ontologies & SKOS thesauri ☆47 Updated 3 years ago
- Site Hound (previously THH) is a Domain Discovery Tool ☆23 Updated 4 years ago
- Specialised bot for periodical grabs and video/audio/etc. webpage scrapes. ☆11 Updated 7 years ago
- Trying to generate name synonyms from Wikidata ☆32 Updated 5 years ago
- "Old SFM" -- manage rules and streams from social data sources, starting with Twitter. ☆86 Updated last year
- Information extraction and interactive visualization of textual datasets for investigative data-driven journalism and eDiscovery ☆57 Updated 11 months ago
- Tools for tracking stories on news homepages ☆48 Updated 5 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends ☆57 Updated last year