apache / tika-docker
Convenience Docker images for Apache Tika Server
☆225 · Updated 2 months ago
Alternatives and similar repositories for tika-docker
Users who are interested in tika-docker are comparing it to the libraries listed below.
- Apache Tika Server as a Docker Image ☆172 · Updated 3 years ago
- PDF text extraction pipeline: self-hosted, local-first, Docker-based ☆329 · Updated 2 years ago
- Docker files for a dockerized unoserver ☆75 · Updated last week
- ☆852 · Updated 3 weeks ago
- Official Dockerfile for Apache Solr ☆30 · Updated last month
- Convert file formats like docx, xlsx to other formats like pdf, png - based on jodconverter and libreoffice ☆95 · Updated 2 months ago
- Process PDFs, Word documents and more with spaCy ☆820 · Updated 9 months ago
- spaCy REST API, wrapped in a Docker container. ☆267 · Updated 2 years ago
- Python bindings to PDFium, reasonably cross-platform. ☆689 · Updated this week
- Docker Images for the Neo4j Graph Database ☆369 · Updated this week
- Apache Tika Server with Tesseract 4 Docker Setup ☆23 · Updated 4 years ago
- A small lightweight HTTP server that converts photos, images and scanned documents to text using optical character recognition by utilizi… ☆124 · Updated this week
- ☆184 · Updated this week
- Mattermost Agents plugin supporting multiple LLMs ☆192 · Updated this week
- Open Source, Distributed, Big Data Enterprise Search Engine ☆90 · Updated 3 weeks ago
- A tiny frontend for OCRing PDF files via the web. ☆51 · Updated 5 years ago
- Elasticsearch File System Crawler (FS Crawler) ☆1,417 · Updated last week
- Weaviate Web UI ☆78 · Updated 2 years ago
- A simple Next.js frontend to explore your local weaviate collections and data ☆39 · Updated 5 months ago
- A vendor-neutral application gateway compatible with the WOPI specifications. ☆68 · Updated this week
- Towards an open source stack for e-commerce search ☆150 · Updated 2 months ago
- Graph database optimized for fast analysis and real-time data processing. It is provided as an extension to PostgreSQL. ☆333 · Updated last year
- A Redis server with additional database capabilities powered by Redis modules. ☆225 · Updated last month
- INCEpTION provides a semantic annotation platform offering intelligent annotation assistance and knowledge management. ☆670 · Updated this week
- Elasticsearch plugin for nearest neighbor search. Store vectors and run similarity search using exact and approximate algorithms. ☆390 · Updated last week
- Free and Open Source Plugin that adds enterprise features to Neo4j Community Distributions ☆127 · Updated 5 months ago
- A lightweight version of Milvus ☆406 · Updated 3 weeks ago
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆404 · Updated last year
- PDF to XML ALTO file converter ☆257 · Updated last month
- An Elasticsearch ingest processor to do named entity extraction using Apache OpenNLP ☆275 · Updated 3 years ago