Norconex / importer
Norconex Importer is a Java library and command-line application meant to "parse" and "extract" content out of a file as plain text, whatever its format (HTML, PDF, Word, etc.). In addition, it allows you to perform any manipulation on the extracted text before using it in your own service or application.
☆33Updated last month
Alternatives and similar repositories for importer
Users who are interested in importer are comparing it to the libraries listed below.
- Annotated Excel parsing library to simplify parsing Excel sheets in Java☆89Updated last year
- Roostrap is a proven rapid application framework compilation built by putting together Spring Roo, Twitter Bootstrap and Google AppEngine…☆35Updated 10 years ago
- Please use the luke bundled with lucene! This repo is archived and frozen now.☆101Updated 6 years ago
- Metl is a simple, web-based integration platform that allows for several different styles of data integration including messaging, file b…☆209Updated last week
- Java2word is a Library to generate MS Word Documents from Java code without any special components.☆96Updated 3 years ago
- Java utility for parsing PDF tabular data using Apache PDFBox and OpenCV☆80Updated 2 years ago
- Java EE Cache Filter☆36Updated 6 years ago
- Apache POI builder☆54Updated 2 years ago
- ElasticSearch Java API tutorial using test cases.☆112Updated 3 years ago
- Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or fi…☆194Updated this week
- Java library to use the XML-RPC functionality of WordPress☆78Updated 4 years ago
- Brix CMS☆126Updated last year
- Scriptella is an open source ETL (Extract-Transform-Load) and script execution tool written in Java. Note: The project is no longer under…☆108Updated 3 months ago
- A Java backend for jQuery-QueryBuilder☆62Updated 6 years ago
- Office 365 client for Java☆50Updated 2 years ago
- CMS to create open social surveys☆60Updated 4 years ago
- JODConverter automates document conversions using LibreOffice/OpenOffice.org☆35Updated 8 years ago
- JD eSurvey is an open source enterprise survey web application written in Java and based on the Spring Framework. Check out the tutorial …☆231Updated 4 years ago
- Adds line-breaking, page-breaking, tables, and styles to PDFBox☆47Updated 2 years ago
- Java OCR allows you to perform OCR and bar code recognition on images (JPEG, PNG, TIFF, PDF, etc.) and output as plain text, xml with ful…☆136Updated 10 years ago
- Single file examples and ready-to-use servers show how to use parallec.io library. Examples to aggregate APIs and publish to Elastic Sear…☆92Updated 8 years ago
- Powerful, hierarchical desktop search engine based on Swing and Lucene.☆18Updated 8 years ago
- Shiro webapp using the buji-pac4j bridge and the javaee-pac4j security library☆84Updated last week
- Pivot4J provides a common API for OLAP servers which can be used to build an analytical service frontend with pivot style GUI.☆130Updated 3 years ago
- Converts XHTML to OpenXML WordML (docx) using docx4j☆145Updated last month
- Small set of tools allowing you to create secure encrypted tokens, which can be later exchanged with 3rd party systems or stored as a lic…☆82Updated 10 years ago
- Implementation of the new headless chrome with chromedriver and selenium.☆38Updated 6 years ago
- Java Quartz monitoring app.☆30Updated last year
- Sample Spring MVC Application demonstrating usage of Spring Data Solr.☆95Updated last year
- Provides a capability to tail log files in a web browser, implemented using websockets.☆63Updated 10 years ago