apache / tika-docker
Convenience Docker images for Apache Tika Server
★204 · Updated last month
Alternatives and similar repositories for tika-docker
Users who are interested in tika-docker are comparing it to the libraries listed below.
- Apache Tika Server as a Docker Image ★172 · Updated 3 years ago
- PDF text extraction pipeline: self-hosted, local-first, Docker-based ★326 · Updated last year
- Process PDFs, Word documents and more with spaCy ★706 · Updated 5 months ago
- ★790 · Updated last month
- Python bindings to PDFium, reasonably cross-platform. ★608 · Updated this week
- A Redis server with additional database capabilities powered by Redis modules. ★216 · Updated last month
- Weaviate Web UI ★67 · Updated last year
- Docker files for a dockerized unoserver ★65 · Updated this week
- Benchmarking PDF libraries ★304 · Updated last month
- Apache Tika Server with Tesseract 4 Docker Setup ★23 · Updated 4 years ago
- Convert file formats like docx, xlx to other formats like pdf, png - based on jodconverter and libreoffice ★90 · Updated 3 months ago
- Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Tex… ★1,067 · Updated 3 months ago
- RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF ★1,009 · Updated 3 weeks ago
- Simplify DOCX files to JSON ★246 · Updated 10 months ago
- weasel: A small and easy workflow system ★85 · Updated last year
- Milvus Command Line ★101 · Updated 3 months ago
- Software that makes labeling PDFs easy. ★418 · Updated last year
- Mattermost Agents plugin supporting multiple LLMs ★178 · Updated this week
- Extract structured text from pdfs quickly ★563 · Updated 2 months ago
- Extract docx headers, footers, (formatted) text, footnotes, endnotes, properties, and images. ★189 · Updated this week
- A small lightweight HTTP server that converts photos, images and scanned documents to text using optical character recognition by utilizi… ★112 · Updated last week
- Elasticsearch plugin for nearest neighbor search. Store vectors and run similarity search using exact and approximate algorithms. ★386 · Updated last week
- Source for the official Caddy v2 Docker Image ★494 · Updated 2 months ago
- Dockerfile to run unoconv as a webservice ★96 · Updated 2 years ago
- Entity resolution for Elasticsearch. ★162 · Updated 7 months ago
- Easily deploy Haystack pipelines as REST APIs and MCP Tools. ★104 · Updated this week
- Towards an open source stack for e-commerce search ★149 · Updated 5 months ago
- A lightweight version of Milvus ★355 · Updated 2 weeks ago
- Sycamore is an LLM-powered search and analytics platform for unstructured data. ★554 · Updated last week
- Docker Images for the Neo4j Graph Database ★357 · Updated last week