WolfgangFahl / pdfindexer
Index and search PDF files using Apache Lucene and PDF Box
☆44Updated last week
Alternatives and similar repositories for pdfindexer
Users who are interested in pdfindexer are comparing it to the libraries listed below.
- An easy-to-use and highly customizable crawler that enables you to create your own little Web archives (WARC/CDX)☆25Updated 8 years ago
- Quick demos using the Toolkit☆96Updated 2 years ago
- A course on free/libre and open source software☆11Updated 2 weeks ago
- ☆38Updated 9 years ago
- An HTML to Asciidoc converter written in JavaScript☆23Updated 10 years ago
- Java port of TLSH (Trend Micro Locality Sensitive Hash)☆21Updated 4 years ago
- Fast in-memory graph structure, powering Gephi☆74Updated this week
- an open-source data management platform for knowledge workers (https://github.com/dswarm/dswarm-documentation/wiki)☆54Updated 7 years ago
- Quickly analyze and explore email with advanced analytics and visualization.☆56Updated 4 years ago
- A curated list of Awesome Apache Solr links and resources.☆110Updated 4 years ago
- Fusion demo app searching open-source project data from the Apache Software Foundation☆43Updated 7 years ago
- Cytoscape 3 desktop version.☆17Updated 3 weeks ago
- JSONiq tutorial☆46Updated 3 months ago
- Combines Apache OpenNLP and Apache Tika and provides facilities for automatically deriving sentiment from text.☆34Updated 2 years ago
- This repository contains the Domain Discovery Tool (DDT) project. DDT is an interactive system that helps users explore and better unders…☆47Updated 3 years ago
- Advanced similarity and duplicate source code proof of concept for our research efforts.☆52Updated 3 years ago
- A Java library for working with Frictionless Data Data Packages.☆23Updated last month
- Cloudfier is a model-driven tool for rapid development of business applications☆22Updated last month
- Elwha is a Java application for monitoring topics, sentiment and events on Twitter streams with the ability to generate notification mess…☆16Updated 10 years ago
- Core API for Silverpeas☆51Updated last week
- An Object Graph Mapping Library For Gremlin☆31Updated 7 years ago
- Demonstration of searching PDF document with Solr, Tika, and Tesseract☆32Updated last year
- 📘 A Citation Style Language (CSL) processor for Java.☆98Updated last week
- A library for extracting tables from PDF files☆89Updated 12 years ago
- Textricator is a tool to extract text from documents and generate structured data.☆350Updated 7 months ago
- Telosys Code Generator - Eclipse Plugin☆61Updated 4 years ago
- A tool to generate UML class diagrams from JSON schema documents☆40Updated 5 years ago
- Wandora is a general purpose information extraction, management and publishing application based on Topic Maps and Java.☆133Updated 2 years ago
- Detect memory leaks in minutes without a heap dump.☆17Updated 8 years ago
- Dashboard composition tooling based on the Uberfire framework☆193Updated 2 years ago