Apache Tika Server as a Docker Image
☆173 · Jul 17, 2022 · Updated 3 years ago
Alternatives and similar repositories for docker-tikaserver
Users who are interested in docker-tikaserver are comparing it to the libraries listed below.
- Docker container to provide Apache Tika RESTful API ☆41 · Feb 12, 2016 · Updated 10 years ago
- Efficient indexing and retrieval of OCR bounding boxes in Solr ☆22 · Mar 13, 2019 · Updated 6 years ago
- Ideas for (tech) stuff to research, build or work on. ☆49 · Jan 27, 2026 · Updated last month
- Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community. ☆1,649 · Updated this week
- Django SKOS-XL Thesaurus manager ☆13 · Oct 18, 2021 · Updated 4 years ago
- This is a REST Server endpoint built using Flask and Python. ☆24 · Nov 16, 2022 · Updated 3 years ago
- LDIF - Linked Data Integration Framework ☆37 · Aug 2, 2016 · Updated 9 years ago
- Some ideas on making Bags into Git repositories ☆16 · Dec 23, 2014 · Updated 11 years ago
- SKOS Support for Apache Lucene and Solr ☆56 · May 12, 2021 · Updated 4 years ago
- Price options by fitting a Lévy distribution ☆10 · Jan 20, 2021 · Updated 5 years ago
- A DropWizard wrapper around Apache Tika. ☆10 · Dec 22, 2016 · Updated 9 years ago
- A project aiming "to significantly advance the state of the art with regard to indexing and querying biomedical data with freely availabl… ☆79 · Feb 17, 2026 · Updated 2 weeks ago
- Text mining on the Royal Library newspaper corpus ☆11 · Dec 3, 2025 · Updated 3 months ago
- General Architecture for Text Engineering ☆49 · Mar 23, 2016 · Updated 9 years ago
- A Nutch 2.2.1 plugin which allows users to shuffle off the responsibility for retrieving pages to a selenium hub/node spoke system. This … ☆16 · Jun 9, 2016 · Updated 9 years ago
- Core libraries by the PRImA Research Lab ☆16 · Jul 30, 2024 · Updated last year
- Extract Data from Wikipedia Lists ☆31 · Aug 27, 2017 · Updated 8 years ago
- A JRuby command line application and library for Apache Tika to extract text and metadata from files of various formats. ☆54 · May 1, 2025 · Updated 10 months ago
- PHP client library for communicating with GetEventStore. ☆12 · Mar 7, 2016 · Updated 10 years ago
- Former Official Gridcoin wiki (deprecated) ☆11 · Feb 10, 2021 · Updated 5 years ago
- In the Django Authentication package is that all users use the same model/profile. This can be a drawback if you have lots of users or yo… ☆25 · Feb 6, 2016 · Updated 10 years ago
- SKOS analysis for Elasticsearch ☆54 · Jun 15, 2016 · Updated 9 years ago
- Express middleware for querying our graphql server built with graph.ql ☆13 · Apr 29, 2018 · Updated 7 years ago
- Elwha is a Java application for monitoring topics, sentiment and events on Twitter streams with the ability to generate notification mess… ☆17 · Sep 11, 2015 · Updated 10 years ago
- Wrap Saxon 9.X's HE's XSLT 2.0 processor for use in JRuby ☆16 · Sep 2, 2019 · Updated 6 years ago
- neonion is a user-centered collaborative semantic annotation webapp developed at the Human-Centered Computing group at Freie Universität … ☆70 · Feb 13, 2019 · Updated 7 years ago
- You already have a beautiful HATEOAS API. You just don't know it yet. ☆40 · Jun 19, 2015 · Updated 10 years ago
- Allow anyone with a modern browser to stream a 1GB, 10GB, 100GB, or 1TB file over the Internet and into a happy home. ☆32 · Oct 7, 2018 · Updated 7 years ago
- Miscellaneous code snippets that I want to have versioned. ☆17 · Mar 5, 2023 · Updated 3 years ago
- Using Centroids of Word Embeddings and Word Mover's Distance for Biomedical Document Retrieval in Question Answering. ☆14 · Jul 13, 2017 · Updated 8 years ago
- A basic concept using Nuxt and Python to decouple template engines in Python web applications. ☆16 · Jun 10, 2018 · Updated 7 years ago
- The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). ☆3,595 · Updated this week
- Sudoku game built with angular.js ☆46 · Aug 1, 2016 · Updated 9 years ago
- EEA ElasticSearch RDF River Plugin ☆64 · Dec 14, 2021 · Updated 4 years ago
- Docker container for Cantaloupe IIIF server ☆21 · Mar 11, 2024 · Updated last year
- Archive of political ad data from the Federal Communications Commission ☆20 · Oct 25, 2017 · Updated 8 years ago
- Turns legal citations in the DOM into links ☆20 · Mar 15, 2017 · Updated 8 years ago
- This is a list of various datasets that are collected by States initially and then provided to federal agencies. ☆20 · Dec 17, 2021 · Updated 4 years ago
- python library for working with IIIF Image and Presentation APIs ☆20 · Oct 28, 2025 · Updated 4 months ago