Norconex Importer is a Java library and command-line application that parses a file, whatever its format (HTML, PDF, Word, etc.), and extracts its content as plain text. It also lets you manipulate the extracted text before using it in your own service or application.
☆35 · Feb 21, 2026 · Updated last month
Alternatives and similar repositories for importer
Users who are interested in importer are comparing it to the libraries listed below.
- Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or fi… ☆200 · Apr 3, 2026 · Updated last week
- Implementation of Norconex Committer for Elasticsearch. ☆11 · Jan 4, 2022 · Updated 4 years ago
- Generic library shared between several projects. ☆14 · Feb 23, 2026 · Updated last month
- Advanced fold methods for Kotlin ☆12 · Apr 1, 2026 · Updated last week
- Maven version of DistributeCrawler ☆10 · Jun 20, 2022 · Updated 3 years ago
- A new, clean and lean network interface reachability library written in Swift. ☆12 · Jun 6, 2023 · Updated 2 years ago
- pylambertw - sklearn interface to analyze and gaussianize heavy-tailed, skewed data ☆16 · Apr 15, 2024 · Updated last year
- FFmpegKit implementation using Kotlin Multiplatform ☆14 · Apr 4, 2023 · Updated 3 years ago
- Code samples for the Speedment ORM ☆13 · Jun 21, 2022 · Updated 3 years ago
- Extremely minimal Compose Multiplatform sample that demonstrates use of on-device AI on iOS and Android. ☆47 · Mar 1, 2026 · Updated last month
- A free multithreaded proxy checking program written in Java. Load a proxy list and check each proxy to verify it's alive to create a new … ☆11 · Nov 5, 2015 · Updated 10 years ago
- Search Sina Weibo topics using PhantomJS ☆12 · Dec 20, 2015 · Updated 10 years ago
- Lua-MapReduce framework implemented in Lua using the luamongo driver and MongoDB as storage. It follows Iterative MapReduce for training of M… ☆25 · Dec 23, 2015 · Updated 10 years ago
- A library of examples showing how to use the Common Crawl corpus (2008-2012, ARC format) ☆65 · Aug 5, 2016 · Updated 9 years ago
- Download historical stock data with tushare and pandas and store it in a MySQL database ☆13 · Dec 18, 2018 · Updated 7 years ago
- A new framework to generate interpretable classification rules ☆18 · Feb 11, 2023 · Updated 3 years ago
- Java client for EventStore (http://geteventstore.com) ☆20 · May 25, 2015 · Updated 10 years ago
- Processing layer of a public-opinion project: word segmentation and sentiment analysis ☆10 · Mar 22, 2016 · Updated 10 years ago
- Monitoring platform based on Spring Boot ☆11 · Jun 17, 2015 · Updated 10 years ago
- Kairos combines a focused crawler and an information extraction engine to convert a list of conference websites into an index filled wit… ☆18 · Feb 20, 2011 · Updated 15 years ago
- Automatic CAPTCHA decoding ☆11 · Apr 17, 2012 · Updated 13 years ago
- A consolidated application platform for Kotlin Multiplatform. ☆20 · Jan 26, 2026 · Updated 2 months ago
- Windows Live API binding and connect support. ☆18 · Dec 1, 2024 · Updated last year
- Cloud-drive search implemented on top of a search engine ☆12 · Nov 15, 2018 · Updated 7 years ago
- Food safety public-opinion analysis system (front-end display module) ☆15 · May 21, 2015 · Updated 10 years ago
- Implementing Java-based text extractors as web APIs (currently only Boilerpipe & Goose) ☆16 · Apr 1, 2012 · Updated 14 years ago
- Sync tushare data automatically ☆15 · Mar 2, 2017 · Updated 9 years ago
- Alias names for Java types ☆15 · Apr 1, 2026 · Updated last week
- Roostrap is a proven rapid application framework compilation built by putting together Spring Roo, Twitter Bootstrap and Google AppEngine… ☆35 · Dec 5, 2014 · Updated 11 years ago
- JBake Maven Plugin - NOTE: Code now resides in main JBake repository - https://github.com/jbake-org/jbake ☆10 · Dec 28, 2021 · Updated 4 years ago
- Blog crawler for the blogforever project. ☆23 · Jan 31, 2014 · Updated 12 years ago
- A lightweight KSP annotation processor that generates reports to track technical debt in Kotlin projects ☆51 · Feb 8, 2026 · Updated 2 months ago
- Template for README.md ☆66 · Feb 4, 2013 · Updated 13 years ago
- UI Components for Solr ☆11 · Apr 24, 2018 · Updated 7 years ago
- A program that collects WeChat official accounts via Sogou WeChat search ☆16 · Nov 12, 2015 · Updated 10 years ago
- ☆17 · May 25, 2015 · Updated 10 years ago
- Executable UML tools (XML schema, Java model compiler, Java + JavaScript model viewer) based on miUML metamodels ☆20 · Sep 18, 2024 · Updated last year
- A general-purpose helper library for POI exports ☆20 · Aug 28, 2017 · Updated 8 years ago
- MyBatis code generator ☆27 · Jan 17, 2017 · Updated 9 years ago