codelibs / fess-crawler
Web/FileSystem Crawler Library
☆34Updated last week
Alternatives and similar repositories for fess-crawler
Users who are interested in fess-crawler are comparing it to the libraries listed below.
- Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or fi…☆196Updated this week
- Pulsar Data Visualization, gets the data from Pulsar Reporting API, builds different charts and displays them in the browser.☆53Updated 10 years ago
- Apache NiFi NLP Processor☆18Updated 2 years ago
- Twitter sentiment analysis using Spark and Stanford CoreNLP and visualization using elasticsearch and kibana☆20Updated 8 years ago
- sql interface for solr cloud☆40Updated 3 years ago
- Apache OpenNLP Sandbox☆46Updated this week
- Elasticsearch plugin for b-bit minhash algorithm☆62Updated last year
- Web Crawler for Elasticsearch☆235Updated 6 years ago
- Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & N…☆277Updated 3 years ago
- Orchestration, Management and Monitoring of Data Processing☆10Updated last week
- Distributed Elastic Message Processing System☆196Updated 2 years ago
- Text retrieval database based on simhash similarity search☆25Updated 2 years ago
- Geographic Place, Date/time, and Pattern entity extraction toolkit along with text extraction from unstructured data and GIS outputters.☆46Updated last week
- An Elasticsearch plugin to return query results as PDF, HTML, or CSV.☆48Updated 7 years ago
- Document Enrichment plugin for Elasticsearch☆28Updated 10 months ago
- ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika and Apache OODT to ingest 10s of millions of files (image…☆95Updated 7 years ago
- The Common Crawl Crawler Engine and Related MapReduce code (2008-2012)☆222Updated 3 years ago
- Suite of tools for detecting changes in web pages and their rendering☆55Updated 2 years ago
- Vert.x web and command-line application to import CSV/XLS/XLSX files into Elasticsearch.☆119Updated 5 years ago
- 蜜蜂牧场 is a data collection and cleaning tool, an ETL tool, and also a scripting language.☆14Updated 7 years ago
- Skeleton for Meetup - Building your own recommendation engine in an hour☆29Updated 4 years ago
- Eclipse Keti is a service designed to protect RESTful APIs using Attribute Based Access Control (ABAC).☆29Updated last year
- Lightweight embedded java full text search engine☆13Updated 5 years ago
- Easy way to get structured stuff into Elasticsearch (CSV, MSSQL, API)☆88Updated 5 years ago
- Big GeoSpatial Data Points Visualization Tool☆19Updated 9 years ago
- A tool for developing and testing ETL and ELT processes for automating the capture, delivery and processing of information in data wareho…☆59Updated 2 years ago
- ☆11Updated 10 years ago
- Additional convenience processors not found in core Apache NiFi☆97Updated 3 years ago
- Zulia Search Engine☆34Updated last week
- Elasticsearch plugin offering Neo4j integration for Personalized Search☆158Updated 4 years ago