codelibs / fess-crawler
Web/FileSystem Crawler Library
☆29 Updated this week
Related projects
Alternatives and complementary repositories for fess-crawler
- This plugin provides a useful feature for multi-language ☆13 Updated 2 years ago
- Fione is Enterprise AI Platform ☆15 Updated last year
- Open Source, Distributed, Big Data Enterprise Search Engine ☆69 Updated last week
- Skeleton for Meetup - Building your own recommendation engine in an hour ☆29 Updated 3 years ago
- Elasticsearch plugin for b-bit minhash algorithm ☆62 Updated 5 months ago
- learning related projects ☆18 Updated 9 years ago
- Web Crawler for Elasticsearch ☆234 Updated 5 years ago
- Visualization of results returned by Solr 6 graph queries ☆10 Updated 8 years ago
- ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika and Apache OODT to ingest 10s of millions of files (image… ☆94 Updated 6 years ago
- Solr Relevance Ranking Analysis and Visualization Tool ☆17 Updated 5 years ago
- This repository contains the Domain Discovery Tool (DDT) project. DDT is an interactive system that helps users explore and better unders… ☆46 Updated 2 years ago
- Apache OpenNLP Sandbox ☆42 Updated this week
- Mirror of Apache ManifoldCF ☆77 Updated last month
- A POC at replicating Facebook Graph Search with Cypher and Neo4j ☆102 Updated 11 years ago
- Suite of tools for detecting changes in web pages and their rendering ☆53 Updated 11 months ago
- Zulia Search Engine ☆29 Updated 3 weeks ago
- Stanford CoreNLP NER addon for Apache Tika's NamedEntityParser ☆13 Updated 2 years ago
- Geographic Place, Date/time, and Pattern entity extraction toolkit along with text extraction from unstructured data and GIS outputters. ☆44 Updated last month
- Big GeoSpatial Data Points Visualization Tool ☆19 Updated 8 years ago
- A custom SimilarityProvider example for Elasticsearch ☆36 Updated 9 years ago
- Develop streaming applications for IBM Streams in Python, Java & Scala. ☆29 Updated 2 years ago
- Twitter River Plugin for Elasticsearch (STOPPED) ☆204 Updated 3 months ago
- Site Hound (previously THH) is a Domain Discovery Tool ☆23 Updated 3 years ago
- A Text Classification API in Java originally developed by DigitalPebble Ltd. The API is independent from the ML implementations used and … ☆48 Updated 3 years ago
- Document Enrichment plugin for Elasticsearch ☆28 Updated 2 weeks ago
- Wikipedia River Plugin for Elasticsearch (STOPPED) ☆74 Updated last year
- Common web archive utility code. ☆50 Updated last month
- Parsing and extracting information from (possibly malformed) HTML/XML documents ☆9 Updated 6 months ago
- This is a REST Server endpoint built using Flask and Python. ☆24 Updated 2 years ago
- An open source search engine for corporate data and websites. ☆107 Updated 7 years ago