LogicalSpark / docker-tikaserver
Apache Tika Server as a Docker Image
☆171 Updated 2 years ago
Related projects
Alternatives and complementary repositories for docker-tikaserver
- FacetView is a pure javascript frontend for ElasticSearch. ☆291 Updated 9 years ago
- Bulk indexing command line tool for elasticsearch. ☆280 Updated last month
- A bundle of useful Elasticsearch plugins ☆110 Updated 6 months ago
- A plugin for language detection in Elasticsearch using Nakatani Shuyo's language detector ☆251 Updated 6 years ago
- Entity resolution for Elasticsearch. ☆157 Updated 3 months ago
- Github mirror of "search/highlighter" - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access… ☆100 Updated 5 months ago
- "Stop worrying about Elasticsearch analyzers", my therapist says ☆154 Updated 3 years ago
- An Elasticsearch ingest processor to do named entity extraction using Apache OpenNLP ☆269 Updated 2 years ago
- a pure javascript frontend for ElasticSearch search indices. ☆79 Updated 6 years ago
- Mapper Attachments Type plugin for Elasticsearch ☆504 Updated last year
- Elasticsearch entity resolution plugin based on Duke ☆210 Updated 4 years ago
- Convenience Docker images for Apache Tika Server ☆135 Updated 2 weeks ago
- SOLR bulk indexing utility for the command line. ☆45 Updated 2 months ago
- ☆184 Updated 5 years ago
- A Python library to detect and extract listing data from HTML pages. ☆109 Updated 7 years ago
- Additional opennlp mapping type for elasticsearch in order to perform named entity recognition ☆136 Updated 8 years ago
- Elasticsearch/Solr Sandbox for exploring explain information and tweaking ☆135 Updated 7 months ago
- A text tagger based on Lucene / Solr, using FST technology ☆174 Updated 10 months ago
- Launch AWS Elastic MapReduce jobs that process Common Crawl data. ☆49 Updated 7 years ago
- Index URLs in Common Crawl ☆193 Updated 7 years ago
- Java and REST APIs for working with time-representing tree in Neo4j ☆207 Updated 3 years ago
- Docker container to provide Apache Tika RESTful API ☆40 Updated 8 years ago
- Elasticsearch Index Termlist ☆117 Updated 5 years ago
- spaCy REST API, wrapped in a Docker container. ☆265 Updated last year
- Text classification using Naive Bayes and Elasticsearch ☆154 Updated 8 years ago
- Entity Extraction Text Processor ☆148 Updated last year
- Extract postal addresses from the DOM ☆66 Updated 12 years ago
- A platform for backing crowdsourcing websites, built in golang for elasticsearch ☆362 Updated 4 years ago
- Tesseract 4 OCR Runtime Environment - Docker Container ☆97 Updated 5 years ago
- An elasticsearch plugin to create hierarchical aggregations ☆51 Updated last month