dossier / html-highlighter
Highlight and select phrases in HTML pages.
☆24Updated 6 years ago
Alternatives and similar repositories for html-highlighter
Users who are interested in html-highlighter are comparing it to the libraries listed below.
- Index URLs in Common Crawl☆198Updated 8 years ago
- FacetView is a pure javascript frontend for ElasticSearch.☆291Updated 10 years ago
- ☆44Updated 10 years ago
- Named-Entity Recognition extension for Google Refine / OpenRefine☆73Updated 8 years ago
- An intelligent reading agent that understands text and translates it into Wikidata statements.☆116Updated 9 years ago
- a pure javascript frontend for ElasticSearch search indices.☆80Updated 7 years ago
- Automatic tagging and analysis of documents in an Apache Solr index for faceted search by RDF(S) Ontologies & SKOS thesauri☆47Updated 3 years ago
- A Python library to detect and extract listing data from HTML pages.☆108Updated 8 years ago
- A tutorial about DBpedia and Linked Data in general☆23Updated 11 years ago
- Browser add-on and web server to support collection and analysis of web browsing data.☆14Updated 9 years ago
- mltk - Moz Language Tool Kit☆12Updated 10 years ago
- A cross-platform command line tool for parallelised content extraction and analysis.☆249Updated 2 months ago
- SKOS analysis for Elasticsearch☆54Updated 9 years ago
- An attempt at creating a silver/gold standard dataset for backtesting yesterday & today's content-extractors☆35Updated 10 years ago
- Semanticizest: dump parser and client☆20Updated 9 years ago
- NEWS: JATE2.0 Beta.11 Released, see details below.☆84Updated 2 years ago
- command-line tool to extract taxonomies from Wikidata☆128Updated 6 years ago
- A queue-controlled browser automation tool for improving web crawl quality☆64Updated 5 months ago
- Solr Dictionary Annotator (Microservice for Spark)☆71Updated 5 years ago
- ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika and Apache OODT to ingest 10s of millions of files (image…☆95Updated 7 years ago
- Easy extraction of keywords and engines from search engine results pages (SERPs).☆93Updated 2 months ago
- General Architecture for Text Engineering☆49Updated 9 years ago
- Automatically extracts and normalizes an online article or blog post publication date☆117Updated 2 years ago
- ☆14Updated 4 years ago
- [UNMAINTAINED] Deploy, run and monitor your Scrapy spiders.☆11Updated this week
- displaCy-ent.js: An open-source named entity visualiser for the modern web☆199Updated 7 years ago
- Version 1.0 of the CrowdTruth Framework for crowdsourcing ground truth data, for training and evaluation of cognitive computing systems. …☆60Updated 7 years ago
- ☆18Updated 8 years ago
- Blog crawler for the blogforever project.☆23Updated 11 years ago
- Extract Data from Wikipedia Lists☆31Updated 8 years ago