This is a Java library for crawling the content of web properties (for example, www.salesforce.com and blogs.salesforce.com). It supports dynamic scaling out of the box, depending on available machine resources (CPU, RAM) and network capacity. It also has a plugin structure, which allows others to write code (plugins) that act on …
☆25 · May 15, 2025 · Updated 9 months ago
Alternatives and similar repositories for SiteCrawler
Users that are interested in SiteCrawler are comparing it to the libraries listed below
- Cyberinfrastructure Shell (CIShell) is an open source, community-driven framework/application for the integration and utilization of data… ☆31 · Nov 28, 2018 · Updated 7 years ago
- Secure REST service to index, search, retrieve and aggregate content from heterogeneous sources. ☆20 · Oct 3, 2024 · Updated last year
- A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED] ☆25 · Sep 14, 2016 · Updated 9 years ago
- Axon Ivy Portal ☆12 · Updated this week
- YARE is a rules engine for Java that simplifies processing rules ☆25 · Jul 12, 2025 · Updated 7 months ago
- The core modules and the platform ☆36 · Aug 15, 2023 · Updated 2 years ago
- Software in this repository is not maintained anymore ☆11 · Jul 6, 2022 · Updated 3 years ago
- Percussion CMS - Content Management System ☆14 · Jul 16, 2025 · Updated 7 months ago
- Apex Toolkit code examples ☆12 · Apr 14, 2025 · Updated 10 months ago
- The next generation of open source search ☆94 · May 25, 2017 · Updated 8 years ago
- MinorThird is a collection of Java classes for storing text, annotating text, and learning to extract entities and categorize text. ☆58 · Feb 2, 2018 · Updated 8 years ago
- gvNIX project ☆42 · Mar 2, 2016 · Updated 10 years ago
- A SCADA system that uses prime for intrusion tolerance. Using PVBrowser as an HMI ☆10 · May 27, 2015 · Updated 10 years ago
- Cloud Mining automatically builds exploratory faceted search systems. ☆52 · Oct 15, 2013 · Updated 12 years ago
- MLX90640/41 python driver ☆12 · Dec 18, 2020 · Updated 5 years ago
- Project content to accompany the Adobe Marketing Summit developer lab ☆11 · Jun 15, 2015 · Updated 10 years ago
- ☆12 · Aug 26, 2022 · Updated 3 years ago
- Reindexer's java connector ☆12 · Feb 16, 2026 · Updated 2 weeks ago
- The goal of this experiment is to take articles and certain metadata and group them by topic. ☆11 · Apr 14, 2016 · Updated 9 years ago
- Port of BaseFlight (with MultiWii 2.3 features) for STM32F4DISCOVERY board + GY-86 (mpu6050 + hmc5883 + ms5611) sensors board ☆15 · Feb 3, 2014 · Updated 12 years ago
- Page Objects made easy for Playwright ☆11 · Jun 20, 2022 · Updated 3 years ago
- Simple MySQL Async/Await Connection Pool ☆10 · Dec 8, 2022 · Updated 3 years ago
- ☆10 · Dec 14, 2023 · Updated 2 years ago
- ☆11 · May 10, 2022 · Updated 3 years ago
- XR21B1411 (B1411) driver for Orange Pi Zero with forced RS-485 mode ☆12 · Jun 18, 2018 · Updated 7 years ago
- ☆12 · Oct 25, 2015 · Updated 10 years ago
- A python library for the YDLidar X2 ☆11 · Sep 22, 2021 · Updated 4 years ago
- FORTRAN Unit Test Suite written in FORTRAN ☆12 · Nov 9, 2025 · Updated 3 months ago
- Example project using LVGL, TFT_eSPI, PlatformIO and Arduino framework on the Espressif Esp32 Box board. ☆11 · Apr 28, 2023 · Updated 2 years ago
- Documentation website source code for Concord ☆15 · Feb 10, 2026 · Updated 3 weeks ago
- Russian phonetical transcription ☆11 · Nov 19, 2025 · Updated 3 months ago
- Elmer/Ice course repository containing example cases and slide material. ☆14 · Sep 29, 2025 · Updated 5 months ago
- Stepper motor balancing robot project using NodeMCU32S and ESP32CAM coding with C++. ☆11 · Sep 14, 2022 · Updated 3 years ago
- Master control of robot using esp32 chip with openmv and tensorflow-lite support. ☆11 · Mar 6, 2023 · Updated 2 years ago
- Green SqlAlchemy extensions for pulsar ☆11 · Nov 24, 2017 · Updated 8 years ago
- Digitization information system build on top of Fedora repository ☆16 · Jan 15, 2019 · Updated 7 years ago
- ☆10 · Aug 27, 2022 · Updated 3 years ago
- Espressif ESP32S3 development board, ESP32S3-MINI development board is built with Espressif ESP32S3-H4R2. ☆12 · Feb 21, 2024 · Updated 2 years ago
- A random data generator to produce realistic data files for multiple file types (e.g. csv, log, json) ☆14 · May 18, 2023 · Updated 2 years ago