Norconex / collector-filesystem
Norconex Filesystem Collector is a flexible crawler for collecting, parsing, and manipulating data ranging from local hard drives to network locations into various data repositories such as search engines.
☆24 Updated last year
Alternatives and similar repositories for collector-filesystem
Users who are interested in collector-filesystem are comparing it to the libraries listed below.
- Automatically exported from code.google.com/p/xml2json-xslt ☆38 Updated 10 years ago
- Work in progress: a new visualization engine ☆34 Updated 5 months ago
- Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or fi… ☆196 Updated last month
- Simple taxonomy management tool and document classifier. ☆57 Updated 6 years ago
- JSONiq Implementation that compiles to JavaScript ☆67 Updated 3 years ago
- Zorba - the NoSQL processor ☆42 Updated 2 years ago
- SOLR bulk indexing utility for the command line. ☆45 Updated 2 months ago
- a pure javascript frontend for ElasticSearch search indices. ☆80 Updated 7 years ago
- XSLTJSON - Convert XML to JSON using XSLT ☆316 Updated 3 years ago
- Visualization of interaction between entities ☆16 Updated 9 years ago
- Quickly turn command-line applications into RESTful webservices with a web-application front-end. You provide a specification of your com… ☆134 Updated 3 months ago
- The open source tools for building, maintaining and deploying Topic Maps-based applications. ☆57 Updated last month
- Browser version of Hyphe (WIP) ☆32 Updated 8 months ago
- Index and search PDF files using Apache Lucene and PDF Box ☆43 Updated 3 months ago
- Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & N… ☆277 Updated 3 years ago
- The JSON discoverer allows you to discover the implicit schema of your JSON documents. Please visit the website to use the tool ☆153 Updated 3 years ago
- an open-source data management platform for knowledge workers (https://github.com/dswarm/dswarm-documentation/wiki) ☆54 Updated 8 years ago
- NARA File Analyzer and Metadata Harvester ☆111 Updated 9 years ago
- Solrstrap is a Query-Result interface for Solr written in JavaScript, HTML and CSS ☆87 Updated 8 years ago
- A queue-controlled browser automation tool for improving web crawl quality ☆64 Updated 5 months ago
- Blazegraph Samples with Sesame, Blueprints, and RDR ☆72 Updated 5 years ago
- Quickly analyze and explore email with advanced analytics and visualization. ☆55 Updated 4 years ago
- A pure JavaScript library for translating complex XML Schemas into JSON Schemas. ☆55 Updated 2 years ago
- JSONiq & XQuery Quality Checker ☆51 Updated last week
- PST extraction and analytic pipeline ☆37 Updated 7 years ago
- This repository contains the Domain Discovery Tool (DDT) project. DDT is an interactive system that helps users explore and better unders… ☆47 Updated 4 years ago
- Suite of tools for detecting changes in web pages and their rendering ☆55 Updated 2 years ago
- The HTML5 PivotViewer is a fork of a project that was started by LobsterPot Solutions as a cross browser, cross platform version of the S… ☆127 Updated last year
- Elwha is a Java application for monitoring topics, sentiment and events on Twitter streams with the ability to generate notification mess… ☆17 Updated 10 years ago
- The smart and simple way to automate document assembly ☆408 Updated 7 years ago