Norconex / importer
Norconex Importer is a Java library and command-line application that parses a file and extracts its content as plain text, whatever its format (HTML, PDF, Word, etc.). It also lets you manipulate the extracted text before using it in your own service or application.
☆34 · Updated 6 months ago
Alternatives and similar repositories for importer:
Users who are interested in importer are comparing it to the libraries listed below.
- Provides simplified access to the ElasticSearch Java API. ☆4 · Updated 4 years ago
- Java 11 Library with tons of utility classes required in all projects ☆33 · Updated 2 weeks ago
- Java utility for parsing PDF tabular data using Apache PDFBox and OpenCV ☆72 · Updated last year
- Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or fi… ☆188 · Updated this week
- Please use the luke bundled with lucene! This repo is archived and frozen now. ☆101 · Updated 6 years ago
- Neuro4j Workflow is a light-weight workflow engine for Java with Eclipse-based development environment. Workflow allows to build reusable… ☆60 · Updated 6 years ago
- Norconex Filesystem Collector is a flexible crawler for collecting, parsing, and manipulating data ranging from local hard drives to netw… ☆22 · Updated 7 months ago
- Fast, flexible, efficient in-memory Java LRU cache, as described in ☆31 · Updated 2 months ago
- Apache POI builder ☆54 · Updated 2 years ago
- Sample audit4j applications. ☆19 · Updated 6 years ago
- OddSource Code Java License Manager ☆26 · Updated 6 years ago
- jORM is a Lightweight Java ORM ☆37 · Updated 6 years ago
- QuartzDesk Executor (QE) is a scalable and generic job scheduling application that can be used to schedule execution of native shell scri… ☆22 · Updated 4 months ago
- Annotated Excel parsing library to simplify parsing Excel sheets in Java ☆85 · Updated 10 months ago
- The SQL Processor is an engine producing the ANSI SQL statements and providing their execution without the necessity to write Java plumbi… ☆27 · Updated last year
- Implementation of the new headless chrome with chromedriver and selenium. ☆38 · Updated 6 years ago
- Tiny License Framework for Java ☆67 · Updated 6 years ago
- YaHP is a Java library that allows you to convert an HTML document into a PDF document. ☆56 · Updated 13 years ago
- Java object serialization and de-serialization processor ☆32 · Updated 7 years ago
- Reverse proxy implemented in Java ☆22 · Updated 7 years ago
- Demo applications for Pippo (http://www.pippo.ro) ☆26 · Updated 3 years ago
- Advanced distributed task distribution library for Hazelcast. Customizable task load balancing with failover. For example: Fair task e… ☆44 · Updated 10 years ago
- Lucene Directory Storage on top of Redis ☆29 · Updated 7 years ago
- Java JNA Wrapper for Leptonica Image Processing Library ☆30 · Updated 2 months ago
- Maven 2 + JAX-WS + Spring + CXF + log4j Demo (web service and clients) just for playing around with SOAP web services. WS default endpoint… ☆28 · Updated 2 years ago
- The simple, stupid job server for Java ☆40 · Updated 4 years ago
- Clone of the Unitils SVN repository. Adds support for Java 8, HSqlDB and immutable collections ☆23 · Updated 3 years ago
- Audit4j Spring Integration. ☆18 · Updated 2 years ago
- Fork of sql2java, a trusty, old code generator ☆16 · Updated 8 years ago
- ElasticSearch Java API tutorial using test cases. ☆112 · Updated 3 years ago