codelibs / fess-crawler
Web/FileSystem Crawler Library
☆28 Updated last week
Alternatives and similar repositories for fess-crawler
Users that are interested in fess-crawler are comparing it to the libraries listed below
- Elasticsearch plugin for b-bit minhash algorithm ☆62 Updated last year
- Web Crawler for Elasticsearch ☆234 Updated 6 years ago
- Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or fi… ☆196 Updated last week
- Document Enrichment plugin for Elasticsearch ☆27 Updated 8 months ago
- Pulsar Data Visualization, gets the data from Pulsar Reporting API, builds different charts and displays them in the browser. ☆53 Updated 10 years ago
- Elasticsearch plugin offering Neo4j integration for Personalized Search ☆157 Updated 4 years ago
- Apache NiFi NLP Processor ☆18 Updated 2 years ago
- Twitter sentiment analysis using Spark and Stanford CoreNLP and visualization using elasticsearch and kibana ☆20 Updated 7 years ago
- ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika and Apache OODT to ingest 10s of millions of files (image… ☆96 Updated 7 years ago
- Neo4j ElasticSearch Integration ☆214 Updated 5 years ago
- The Common Crawl Crawler Engine and Related MapReduce code (2008-2012) ☆221 Updated 2 years ago
- Open-source Enterprise Grade Search Engine Software ☆510 Updated 3 years ago
- Suite of tools for detecting changes in web pages and their rendering ☆55 Updated last year
- Text retrieval database based on simhash similarity search ☆24 Updated 2 years ago
- Metl is a simple, web-based integration platform that allows for several different styles of data integration including messaging, file b… ☆211 Updated 2 weeks ago
- Implementation of Vision Based Page Segmentation algorithm in Java ☆103 Updated 6 years ago
- Storm / Solr Integration ☆19 Updated last year
- Apache OpenNLP Sandbox ☆44 Updated this week
- Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & N… ☆275 Updated 3 years ago
- sql interface for solr cloud ☆40 Updated 3 years ago
- A web based data mining workflow platform with real-time analysis capabilities ☆49 Updated 3 years ago
- This project deals with hierarchical classification of web pages based on dmoz dataset. ☆14 Updated 11 years ago
- Skeleton for Meetup - Building your own recommendation engine in an hour ☆29 Updated 4 years ago
- This plugin provides a feature to change top N documents in a search result. ☆56 Updated 2 years ago
- ☆11 Updated 10 years ago
- Clone version of LingPipe 4.1.0, with support for unsupervised training ☆32 Updated 12 years ago
- Geographic Place, Date/time, and Pattern entity extraction toolkit along with text extraction from unstructured data and GIS outputters. ☆46 Updated last week
- A toolkit for clustering web pages based on various similarity measures. ☆34 Updated 4 years ago
- Parse wikipedia dumps and index (some) page data to elasticsearch ☆49 Updated 10 years ago
- Easy way to get structured stuff into Elasticsearch (CSV, MSSQL, API) ☆88 Updated 5 years ago