apache / nutch
Apache Nutch is an extensible and scalable web crawler
☆2,886 Updated this week
Related projects:
- Open Source Web Crawler for Java ☆4,533 Updated 2 years ago
- Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project. ☆2,779 Updated last week
- WebCollector is an open source web crawler framework based on Java. It provides some simple interfaces for crawling the Web, you can set up … ☆3,067 Updated 5 months ago
- Apache Lucene and Solr open-source search software ☆4,366 Updated this week
- Apache Curator ☆3,101 Updated last week
- Apache HBase ☆5,199 Updated this week
- Ehcache 3.x line ☆2,007 Updated last month
- Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log-l… ☆2,523 Updated 4 months ago
- Mirror of Apache ActiveMQ ☆2,296 Updated this week
- Enterprise Stream Process Engine ☆3,915 Updated last year
- Apache ZooKeeper ☆12,147 Updated this week
- Eclipse Jetty® - Web Container & Clients - supports HTTP/2, HTTP/1.1, HTTP/1.0, websocket, servlets, and more ☆3,832 Updated this week
- ☆4,797 Updated this week
- Apache Storm ☆6,590 Updated 2 weeks ago
- Mirror of Apache Mahout ☆2,133 Updated 2 weeks ago
- A scalable, mature and versatile web crawler based on Apache Storm ☆879 Updated this week
- Apache Shiro ☆4,304 Updated this week
- cglib - Byte Code Generation Library is a high-level API to generate and transform Java byte code. It is used by AOP, testing, data access … ☆4,784 Updated last month
- Easy to use lightweight web crawler (易用的轻量化网络爬虫) ☆2,500 Updated 6 months ago
- A scalable web crawler framework for Java. ☆11,378 Updated last month
- Do not send pull requests! Automated Git clone of various OpenJDK branches ☆2,166 Updated 4 years ago
- Mirror of Apache HttpClient ☆1,454 Updated this week
- Code for Quartz Scheduler ☆6,238 Updated last month
- Elasticsearch Java Rest Client. ☆2,112 Updated last year
- ZooKeeper client wrapper and rich ZooKeeper framework ☆2,156 Updated last year
- Java serialization library, proto compiler, code generator ☆2,041 Updated 10 months ago
- Drools is a rule engine, DMN engine and complex event processing (CEP) engine for Java. ☆5,827 Updated this week
- Benchmark comparing serialization libraries on the JVM ☆3,286 Updated 11 months ago
- Redis Java client ☆11,792 Updated this week
- Advanced Java Redis client for thread-safe sync, async, and reactive usage. Supports Cluster, Sentinel, Pipelining, and codecs. ☆5,355 Updated this week