WolfgangFahl / pdfindexer
Index and search PDF files using Apache Lucene and PDF Box
☆44 Updated 3 months ago
Alternatives and similar repositories for pdfindexer
Users who are interested in pdfindexer are comparing it to the libraries listed below.
- ☆38 Updated 9 years ago
- An HTML to Asciidoc converter written in JavaScript ☆23 Updated 10 years ago
- A course on free/libre and open source software ☆11 Updated last year
- This repository contains the Domain Discovery Tool (DDT) project. DDT is an interactive system that helps users explore and better unders… ☆47 Updated 3 years ago
- Combines Apache OpenNLP and Apache Tika and provides facilities for automatically deriving sentiment from text. ☆34 Updated 2 years ago
- A curated list of Awesome Apache Solr links and resources. ☆110 Updated 4 years ago
- Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & N… ☆273 Updated 3 years ago
- Textricator is a tool to extract text from documents and generate structured data. ☆350 Updated 6 months ago
- A tool to generate UML class diagrams from JSON schema documents ☆40 Updated 5 years ago
- Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or fi… ☆194 Updated last week
- Automatic tagging and analysis of documents in an Apache Solr index for faceted search by RDF(S) Ontologies & SKOS thesauri ☆47 Updated 3 years ago
- Preliminary Solr DQ / Data Quality experiments and prototype, and SolrJ wrapper utilities ☆26 Updated 8 months ago
- A library for extracting tables from PDF files ☆89 Updated 12 years ago
- Wandora is a general purpose information extraction, management and publishing application based on Topic Maps and Java. ☆133 Updated 2 years ago
- an open-source data management platform for knowledge workers (https://github.com/dswarm/dswarm-documentation/wiki) ☆54 Updated 7 years ago
- Quickly analyze and explore email with advanced analytics and visualization. ☆56 Updated 4 years ago
- Fusion demo app searching open-source project data from the Apache Software Foundation ☆43 Updated 7 years ago
- Open Semantic Visual Linked Data Graph Explorer: Open Source tool (web app) and user interface (UI) for discovery, exploration and visuali… ☆86 Updated 5 years ago
- JSONiq tutorial ☆46 Updated 2 months ago
- Core API for Silverpeas ☆50 Updated this week
- scraper related helper functions ☆27 Updated 11 years ago
- Cytoscape 3 desktop version. ☆17 Updated last week
- TheMovieDB in Solr ☆22 Updated last year
- Installer for Thymeflow, a personal knowledge management system. ☆34 Updated 7 years ago
- Tool for visualizing hOCR output from Tesseract (or other OCR engines that support hOCR). ☆24 Updated 10 years ago
- Python library and command line tool for converting data from one format to another ☆99 Updated 5 years ago
- FreeQDA ☆29 Updated 5 years ago
- Fast in-memory graph structure, powering Gephi ☆74 Updated last week
- An interactive network analysis & visualization tool ☆23 Updated 6 years ago
- Quick demos using the Toolkit ☆96 Updated 2 years ago