Norconex / collector-filesystem
Norconex Filesystem Collector is a flexible crawler that collects, parses, and manipulates data from sources ranging from local hard drives to network locations, and stores it in various data repositories such as search engines.
☆23 Updated 11 months ago
Alternatives and similar repositories for collector-filesystem
Users who are interested in collector-filesystem are comparing it to the libraries listed below.
- an open-source data management platform for knowledge workers (https://github.com/dswarm/dswarm-documentation/wiki) ☆54 Updated 7 years ago
- Simple taxonomy management tool and document classifier. ☆56 Updated 5 years ago
- ☆36 Updated last year
- This repository contains the Domain Discovery Tool (DDT) project. DDT is an interactive system that helps users explore and better unders… ☆46 Updated 3 years ago
- The smart and simple way to automate document assembly ☆408 Updated 7 years ago
- Work in progress: a new visualization engine ☆34 Updated 2 weeks ago
- JSONiq Implementation that compiles to JavaScript ☆66 Updated 3 years ago
- This is the facade for installation and access to the individual components ☆15 Updated 7 years ago
- a pure javascript frontend for ElasticSearch search indices. ☆80 Updated 7 years ago
- XML Director - XML Content Management ☆16 Updated last year
- Browser version of Hyphe (WIP) ☆31 Updated 4 months ago
- Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & N… ☆271 Updated 2 years ago
- Explore networks and publish narratives. ☆52 Updated 4 years ago
- Open Semantic Visual Linked Data Graph Explorer: Open Source tool (web app) and user interface (UI) for discovery, exploration and visuali… ☆86 Updated 5 years ago
- A queue-controlled browser automation tool for improving web crawl quality ☆62 Updated last month
- The open source tools for building, maintaining and deploying Topic Maps-based applications. ☆57 Updated 2 weeks ago
- Highly performant version of open-text-summarizer ☆38 Updated 11 years ago
- BaseX Distribution Files ☆21 Updated 3 months ago
- Neddick: Open Source Information Discovery Platform ☆36 Updated 2 years ago
- Suite of tools for detecting changes in web pages and their rendering ☆55 Updated last year
- JSONiq & XQuery Quality Checker ☆51 Updated 3 weeks ago
- SOLR bulk indexing utility for the command line. ☆44 Updated 2 months ago
- The HTML5 PivotViewer is a fork of a project that was started by LobsterPot Solutions as a cross browser, cross platform version of the S… ☆125 Updated 8 months ago
- Visualization of interaction between entities ☆16 Updated 8 years ago
- Segrada - Semantic Graph Database ☆71 Updated 5 months ago
- An open source search engine for corporate data and websites. ☆106 Updated 8 years ago
- ☆138 Updated 2 years ago
- A cross-platform command line tool for parallelised content extraction and analysis. ☆249 Updated last month
- A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR … ☆65 Updated last year
- Quickly analyze and explore email with advanced analytics and visualization. ☆56 Updated 3 years ago