apache / nutch
Apache Nutch is an extensible and scalable web crawler
☆2,971Updated last month
Alternatives and similar repositories for nutch:
Users who are interested in nutch are comparing it to the libraries listed below.
- Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.☆2,901Updated last week
- Open Source Web Crawler for Java☆4,574Updated 3 years ago
- Apache Shiro☆4,357Updated this week
- Apache ActiveMQ Classic☆2,340Updated last week
- WebCollector is an open source web crawler framework based on Java. It provides some simple interfaces for crawling the Web, you can set up …☆3,076Updated last month
- Drools is a rule engine, DMN engine and complex event processing (CEP) engine for Java.☆5,949Updated this week
- Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log-l…☆2,547Updated 4 months ago
- A scalable, mature and versatile web crawler based on Apache Storm☆901Updated this week
- Apache Lucene and Solr open-source search software☆4,367Updated 4 months ago
- A scalable web crawler framework for Java.☆11,494Updated last week
- Ehcache 3.x line☆2,034Updated last month
- Enterprise Stream Process Engine☆3,905Updated last year
- Apache Storm☆6,613Updated this week
- Apache Curator☆3,132Updated this week
- Easy-to-use lightweight web crawler☆2,510Updated last year
- Spring integration for MyBatis 3☆2,853Updated this week
- Apache Tomcat☆7,718Updated this week
- Mirror of Apache HttpClient☆1,477Updated this week
- Apache log4j1☆872Updated 2 years ago
- A UI dashboard that allows CRUD operations on Zookeeper.☆2,371Updated last year
- A mature, highly concurrent JDBC connection pooling library, with support for caching and reuse of PreparedStatements.☆1,296Updated this week
- Apache ZooKeeper☆12,384Updated last week
- Redis Java client☆11,975Updated this week
- Do not send pull requests! Automated Git clone of various OpenJDK branches☆2,163Updated 4 years ago
- cglib - Byte Code Generation Library is a high-level API to generate and transform Java byte code. It is used by AOP, testing, data access …☆4,829Updated 6 months ago
- Azkaban workflow manager.☆4,489Updated 7 months ago
- Apache HBase☆5,282Updated this week
- Mirror of Apache PDFBox☆2,750Updated this week
- MyBatis integration with Spring Boot☆4,187Updated this week
- The Definitive Guide to Elasticsearch☆3,564Updated 3 years ago