Norconex / collector-filesystem
Norconex Filesystem Collector is a flexible crawler for collecting, parsing, and manipulating data ranging from local hard drives to network locations into various data repositories such as search engines.
☆23Updated last year
Alternatives and similar repositories for collector-filesystem
Users who are interested in collector-filesystem compare it to the libraries listed below.
- Simple taxonomy management tool and document classifier.☆56Updated 5 years ago
- A pure JavaScript frontend for ElasticSearch search indices.☆80Updated 7 years ago
- Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or fi…☆194Updated 2 weeks ago
- Neddick: Open Source Information Discovery Platform☆36Updated 2 years ago
- An open source search engine for corporate data and websites.☆107Updated 8 years ago
- A queue-controlled browser automation tool for improving web crawl quality☆62Updated 2 months ago
- The smart and simple way to automate document assembly☆408Updated 7 years ago
- SOLR bulk indexing utility for the command line.☆45Updated last week
- The open source tools for building, maintaining and deploying Topic Maps-based applications.☆57Updated last month
- Browser version of Hyphe (WIP)☆31Updated 5 months ago
- Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & N…☆274Updated 3 years ago
- Visualization of interaction between entities☆16Updated 8 years ago
- ☆138Updated 2 years ago
- A search interface and wayback machine for the UKWA Solr based warc-indexer framework.☆131Updated 2 weeks ago
- FacetView is a pure javascript frontend for ElasticSearch.☆291Updated 10 years ago
- This repository contains the Domain Discovery Tool (DDT) project. DDT is an interactive system that helps users explore and better unders…☆47Updated 3 years ago
- A web application for digital assets management.☆53Updated 4 years ago
- Suite of tools for detecting changes in web pages and their rendering☆55Updated last year
- A Relaxed Schema Graph Database Management System☆52Updated 5 years ago
- JSONiq Implementation that compiles to JavaScript☆66Updated 3 years ago
- Open-source Enterprise Grade Search Engine Software☆511Updated 3 years ago
- Work in progress: a new visualization engine☆34Updated last month
- Index and search PDF files using Apache Lucene and PDF Box☆44Updated 3 months ago
- Tools for exploring the contents of web archive files.☆40Updated 5 years ago
- Solrstrap is a Query-Result interface for Solr written in JavaScript, HTML and CSS☆87Updated 8 years ago
- This is the facade for installation and access to the individual components☆15Updated 7 years ago
- Elwha is a Java application for monitoring topics, sentiment and events on Twitter streams with the ability to generate notification mess…☆16Updated 10 years ago
- Scruffy micro web server to have your own UML class/sequence diagram page like yUML and even more lean.☆44Updated 6 years ago
- The JSON discoverer allows you to discover the implicit schema of your JSON documents. Please visit the website to use the tool☆154Updated 2 years ago
- A component based data flow framework with a drag-n-drop Web 2.0 interface. Based on Stackless Python and inspired by Yahoo! Pipes.☆150Updated 13 years ago