apache / tika-docker
Convenience Docker images for Apache Tika Server
☆199 · Updated last week
Alternatives and similar repositories for tika-docker
Users who are interested in tika-docker are comparing it to the repositories listed below.
- Running Docling as an API service ☆542 · Updated this week
- ☆187 · Updated 2 weeks ago
- 🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based ☆323 · Updated last year
- ☆769 · Updated last week
- Python bindings to PDFium. Reasonably cross-platform. ☆596 · Updated this week
- 📚 Process PDFs, Word documents and more with spaCy ☆678 · Updated 4 months ago
- Self-hosted web UI for Qdrant ☆297 · Updated this week
- A Python library to define and validate data types in Docling. ☆155 · Updated this week
- Official Dockerfile for Apache Solr ☆28 · Updated 4 months ago
- ☆94 · Updated this week
- Weaviate Web UI ☆65 · Updated last year
- Extract structured text from PDFs quickly ☆512 · Updated last month
- Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Tex… ☆1,053 · Updated 3 months ago
- Docker Images for the Neo4j Graph Database ☆355 · Updated last week
- Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provi… ☆38 · Updated 4 months ago
- Scrape documentation into Meilisearch ☆320 · Updated 6 months ago
- RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF ☆980 · Updated 2 weeks ago
- Dockerfile to run unoconv as a webservice ☆96 · Updated 2 years ago
- Benchmarking PDF libraries ☆296 · Updated 2 weeks ago
- Elasticsearch plugin for nearest neighbor search. Store vectors and run similarity search using exact and approximate algorithms. ☆383 · Updated this week
- ☆151 · Updated this week
- Visualize Different Text Splitting Methods ☆272 · Updated 6 months ago
- ☆19 · Updated 5 months ago
- Easily deploy Haystack pipelines as REST APIs and MCP Tools. ☆93 · Updated last week
- Milvus Command Line ☆99 · Updated 2 months ago
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model ☆17 · Updated 9 months ago
- Search preview for Meilisearch ☆261 · Updated 2 weeks ago
- Graph database optimized for fast analysis and real-time data processing. It is provided as an extension to PostgreSQL. ☆307 · Updated last year
- A proxy server for multiple ollama instances with Key security ☆462 · Updated last week
- A Docker build for Solr, to manage the official Docker hub solr image ☆446 · Updated 2 years ago