apache / tika-docker
Convenience Docker images for Apache Tika Server
☆209 · Updated 2 weeks ago
Alternatives and similar repositories for tika-docker
Users who are interested in tika-docker compare it to the repositories listed below.
- Apache Tika Server as a Docker Image ☆172 · Updated 3 years ago
- Docker files for a dockerized unoserver ☆73 · Updated last week
- 🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based ☆327 · Updated last year
- ☆194 · Updated 3 weeks ago
- A lightweight version of Milvus ☆373 · Updated 2 weeks ago
- A small lightweight HTTP server that converts photos, images and scanned documents to text using optical character recognition by utilizi… ☆120 · Updated this week
- 📚 Process PDFs, Word documents and more with spaCy ☆761 · Updated 6 months ago
- Official Elastic connectors for third-party data sources ☆117 · Updated last week
- Official Dockerfile for Apache Solr ☆29 · Updated 2 weeks ago
- Extract structured text from PDFs quickly ☆605 · Updated 3 months ago
- The Portainer agent ☆368 · Updated last week
- A basic tool that extracts the structure from the PDF files of scientific articles. ☆75 · Updated 3 years ago
- Simplify DOCX files to JSON ☆251 · Updated last year
- Headless LibreOffice in Docker listening for API requests. ☆64 · Updated last year
- ☆179 · Updated this week
- Docker Images for the Neo4j Graph Database ☆363 · Updated this week
- A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks. ☆375 · Updated last month
- Benchmarking PDF libraries ☆312 · Updated 3 months ago
- Elasticsearch File System Crawler (FS Crawler) ☆1,411 · Updated this week
- Convert Word documents to beautiful Markdown. Via command line or in your browser. ☆138 · Updated last week
- PDF to XML ALTO file converter ☆254 · Updated 3 weeks ago
- Source for the official Caddy v2 Docker Image ☆509 · Updated last month
- Pipeline for converting PDFs to raw text with PaddleOCR ☆23 · Updated 2 years ago
- A Python library to define and validate data types in Docling. ☆185 · Updated this week
- Weaviate Web UI ☆73 · Updated 2 years ago
- Towards an open source stack for e-commerce search ☆150 · Updated 6 months ago
- 🦦 weasel: A small and easy workflow system ☆87 · Updated last year
- Easily deploy Haystack pipelines as REST APIs and MCP Tools. ☆116 · Updated this week
- Free and Open Source Plugin that adds enterprise features to Neo4j Community Distributions ☆117 · Updated 3 months ago
- This project aims to extract text from PDF files using the outputs generated by the pdf-document-layout-analysis service. By leveraging t… ☆34 · Updated 8 months ago