apache / tika-docker
Convenience Docker images for Apache Tika Server
☆206 Updated 2 weeks ago
Alternatives and similar repositories for tika-docker
Users who are interested in tika-docker are comparing it to the repositories listed below.
- Python bindings to PDFium, reasonably cross-platform. ☆629 Updated this week
- ☆192 Updated last week
- ☆802 Updated last week
- Self-hosted web UI for Qdrant ☆323 Updated this week
- A lightweight version of Milvus ☆364 Updated last week
- A Redis server with additional database capabilities powered by Redis modules. ☆214 Updated 2 months ago
- Mattermost Agents plugin supporting multiple LLMs ☆178 Updated last week
- A python library to define and validate data types in Docling. ☆173 Updated this week
- Pipeline for converting PDFs to raw text with PaddleOCR ☆23 Updated 2 years ago
- Benchmarking PDF libraries ☆309 Updated 2 months ago
- Simple package to extract text with coordinates from programmatic PDFs ☆180 Updated this week
- Simplify DOCX files to JSON ☆250 Updated 11 months ago
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆396 Updated last year
- Weaviate Web UI ☆70 Updated last year
- A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks. ☆363 Updated 3 weeks ago
- 🍁 Sycamore is an LLM-powered search and analytics platform for unstructured data. ☆560 Updated this week
- Knowledge Table is an open-source package designed to simplify extracting and exploring structured data from unstructured documents. ☆604 Updated 9 months ago
- Official Dockerfile for Apache Solr ☆29 Updated last month
- Extract structured text from pdfs quickly ☆589 Updated 2 months ago
- Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provi… ☆38 Updated 5 months ago
- ☆172 Updated this week
- Lightweight, performant, deep table extraction ☆504 Updated last month
- ☆19 Updated 6 months ago
- 🦦 weasel: A small and easy workflow system ☆85 Updated last year
- Graph database optimized for fast analysis and real-time data processing. It is provided as an extension to PostgreSQL. ☆310 Updated last year
- Milvus Command Line ☆105 Updated 2 weeks ago
- Easily deploy Haystack pipelines as REST APIs and MCP Tools. ☆109 Updated this week
- Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cro… ☆854 Updated 2 months ago
- A proxy server for multiple ollama instances with Key security ☆485 Updated last month
- Simple UI for MinIO Object Storage ☆1,068 Updated 2 weeks ago