Norconex / collector-filesystem
Norconex Filesystem Collector is a flexible crawler for collecting, parsing, and manipulating data ranging from local hard drives to network locations into various data repositories such as search engines.
☆22Updated 8 months ago
Alternatives and similar repositories for collector-filesystem
Users who are interested in collector-filesystem are comparing it to the libraries listed below:
- This repository contains the Domain Discovery Tool (DDT) project. DDT is an interactive system that helps users explore and better unders…☆45Updated 3 years ago
- Open Source, Distributed, Big Data Enterprise Search Engine☆69Updated this week
- A library to store metadata of relational databases including the schema, statistics, and integrity constraints.☆25Updated 6 years ago
- A PDFBox fork intended to be used as PDF processor for Sejda and PDFsam☆50Updated last week
- An easy-to-use and highly customizable crawler that enables you to create your own little Web archives (WARC/CDX)☆25Updated 7 years ago
- Core API for Silverpeas☆50Updated this week
- Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or fi…☆188Updated this week
- Web/FileSystem Crawler Library☆29Updated 2 weeks ago
- A java library for creating standalone, portable, schema-full object databases supporting pagination and faceted search, and offering str…☆16Updated 8 years ago
- Suite of tools for detecting changes in web pages and their rendering☆54Updated last year
- Advanced desktop search/corpus exploration prototype☆21Updated 3 years ago
- Fast in-memory graph structure, powering Gephi☆75Updated 3 weeks ago
- This is the facade for installation and access to the individual components☆15Updated 7 years ago
- Common web archive utility code.☆55Updated 2 weeks ago
- Angular JS Solr and Elasticsearch and OpenSearch Diagnostic Search Services☆26Updated 3 months ago
- Simple search results with Solr and EmberJS☆58Updated 6 years ago
- Browser version of Hyphe (WIP)☆30Updated 3 weeks ago
- JavaScript library to talk to multiple OLAP backends from multiple frontends☆17Updated 12 years ago
- Open source offering for the Logscape log management tool.☆28Updated 3 years ago
- An open-source data management platform for knowledge workers (https://github.com/dswarm/dswarm-documentation/wiki)☆54Updated 7 years ago
- Stanford CoreNLP NER addon for Apache Tika's NamedEntityParser☆13Updated 3 years ago
- Fess Site Search provides JavaScript files.☆23Updated 2 months ago
- Provides a modern and scalable web server as SIRIUS module☆22Updated last week
- Norconex Importer is a Java library and command-line application meant to "parse" and "extract" content out of a file as plain text, what…☆34Updated last week
- Database smell detector☆13Updated 7 years ago
- Open Semantic Visual Linked Data Graph Explorer: Open Source tool (web app) and user interface (UI) for discovery, exploration and visuali…☆83Updated 5 years ago
- Automatically exported from code.google.com/p/xml2json-xslt☆38Updated 10 years ago
- SOLR bulk indexing utility for the command line.☆44Updated 2 months ago
- Java library for computing structural differences between XML document trees☆22Updated 10 years ago
- ☆36Updated last year