Norconex / collector-filesystem
Norconex Filesystem Collector is a flexible crawler that collects, parses, and manipulates data from sources ranging from local hard drives to network locations, and stores it in data repositories such as search engines.
☆24 · Updated last year
Alternatives and similar repositories for collector-filesystem
Users who are interested in collector-filesystem compare it to the libraries listed below.
- Suite of tools for detecting changes in web pages and their rendering ☆55 · Updated 2 years ago
- JSONiq implementation that compiles to JavaScript ☆66 · Updated 3 years ago
- ☆36 · Updated 2 years ago
- IMAP and POP3 email importer for Elasticsearch (no river anymore) ☆102 · Updated 3 years ago
- Work in progress: a new visualization engine ☆34 · Updated 3 months ago
- An open-source data management platform for knowledge workers (https://github.com/dswarm/dswarm-documentation/wiki) ☆54 · Updated 8 years ago
- Browser version of Hyphe (WIP) ☆32 · Updated 7 months ago
- Simple taxonomy management tool and document classifier. ☆56 · Updated 5 years ago
- JavaScript library to talk to multiple OLAP backends from multiple frontends ☆17 · Updated 12 years ago
- Uses your app logs to visualize how the data moves between the code, database, HTTP services, message queue, external storages, etc. ☆23 · Updated last year
- A web application for digital assets management. ☆54 · Updated 4 years ago
- A component-based data flow framework with a drag-n-drop Web 2.0 interface. Based on Stackless Python and inspired by Yahoo! Pipes. ☆150 · Updated 13 years ago
- Zorba - the NoSQL processor ☆42 · Updated 2 years ago
- This is the facade for installation and access to the individual components ☆15 · Updated last week
- A collection of datasets and databases ☆24 · Updated 7 years ago
- Index and search PDF files using Apache Lucene and PDF Box ☆43 · Updated 2 months ago
- Open-source Enterprise Grade Search Engine Software ☆512 · Updated 3 years ago
- Home of The Fr8 Project ☆43 · Updated 8 years ago
- ConfrontaPDF compares PDF files, GUI or command line ☆14 · Updated 3 years ago
- Explore networks and publish narratives. ☆52 · Updated 5 years ago
- ☆139 · Updated 2 years ago
- The open source tools for building, maintaining and deploying Topic Maps-based applications. ☆57 · Updated 2 weeks ago
- XML Director - XML Content Management ☆16 · Updated last year
- SOLR bulk indexing utility for the command line. ☆45 · Updated last month
- Simple search results with Solr and EmberJS ☆58 · Updated 6 years ago
- Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & N… ☆276 · Updated 3 years ago
- A queue-controlled browser automation tool for improving web crawl quality ☆63 · Updated 4 months ago
- Quickly analyze and explore email with advanced analytics and visualization. ☆55 · Updated 4 years ago
- This repository contains the Domain Discovery Tool (DDT) project. DDT is an interactive system that helps users explore and better unders… ☆47 · Updated 4 years ago
- An open source search engine for corporate data and websites. ☆107 · Updated 8 years ago