LogicalSpark / docker-tikaserver
Apache Tika Server as a Docker Image
☆172 Updated 2 years ago
Alternatives and similar repositories for docker-tikaserver:
Users who are interested in docker-tikaserver are comparing it to the libraries listed below.
- A bundle of useful Elasticsearch plugins ☆110 Updated last year
- FacetView is a pure javascript frontend for ElasticSearch. ☆290 Updated 9 years ago
- spaCy REST API, wrapped in a Docker container. ☆267 Updated 2 years ago
- Elasticsearch/Solr Sandbox for exploring explain information and tweaking ☆137 Updated last year
- Github mirror of "search/highlighter" - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access… ☆103 Updated 2 months ago
- An Elasticsearch ingest processor to do named entity extraction using Apache OpenNLP ☆271 Updated 2 years ago
- "Stop worrying about Elasticsearch analyzers", my therapist says ☆154 Updated 3 years ago
- Bulk indexing command line tool for elasticsearch. ☆280 Updated last month
- A plugin for language detection in Elasticsearch using Nakatani Shuyo's language detector ☆252 Updated 7 years ago
- a pure javascript frontend for ElasticSearch search indices. ☆78 Updated 7 years ago
- Entity resolution for Elasticsearch. ☆159 Updated 3 months ago
- Index URLs in Common Crawl ☆194 Updated 7 years ago
- Extract postal addresses from the DOM ☆66 Updated 12 years ago
- Mapper Attachments Type plugin for Elasticsearch ☆504 Updated last year
- Dockerfile to run unoconv as a webservice ☆96 Updated 2 years ago
- A URL tokenizer and token filter plugin for Elasticsearch ☆63 Updated 3 years ago
- Launch AWS Elastic MapReduce jobs that process Common Crawl data. ☆49 Updated 8 years ago
- Naive Bayes Classifier implemented with Elasticsearch Aggregations ☆51 Updated 11 years ago
- SOLR bulk indexing utility for the command line. ☆45 Updated 3 weeks ago
- SKOS analysis for Elasticsearch ☆54 Updated 8 years ago
- Decompounding Plugin for Elasticsearch ☆87 Updated 4 years ago
- LA-PDFText is a system for extracting accurate text from PDF-based research articles (and an interface to be able to improve performance … ☆82 Updated 7 years ago
- Carrot2 plugin for ElasticSearch ☆291 Updated 2 years ago
- Curated synonym files and Helpers for Elasticsearch Synonym Token Filter ☆64 Updated last year
- A collection of elasticsearch command line tools for doing things like bulk importing/exporting and exporting/importing mappings. ☆190 Updated 2 years ago
- A Python library to detect and extract listing data from HTML pages. ☆108 Updated 7 years ago
- A Docker build for Solr, to manage the official Docker hub solr image ☆445 Updated 2 years ago
- A Scrapy pipeline which sends items to an Elasticsearch server ☆328 Updated 2 years ago
- An expandable and scalable OCR pipeline ☆87 Updated 7 years ago
- Run pdf2htmlEX in a Docker container. ☆25 Updated last year