apache / nutch
Apache Nutch is an extensible and scalable web crawler
☆3,043 Updated this week
Alternatives and similar repositories for nutch
Users who are interested in nutch are comparing it to the libraries listed below.
- Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project. ☆2,999 Updated last week
- Open Source Web Crawler for Java ☆4,594 Updated 3 years ago
- WebCollector is an open source web crawler framework based on Java. It provides some simple interfaces for crawling the Web, you can set up … ☆3,075 Updated 6 months ago
- A scalable, mature and versatile web crawler based on Apache Storm ☆921 Updated this week
- Easy-to-use lightweight web crawler ☆2,514 Updated last year
- Apache Lucene and Solr open-source search software ☆4,372 Updated 9 months ago
- A scalable web crawler framework for Java. ☆11,583 Updated last week
- Apache log4j1 ☆868 Updated 2 years ago
- Mirror of Apache Mahout ☆2,171 Updated last week
- Eclipse Jetty® - Web Container & Clients - supports HTTP/2, HTTP/1.1, HTTP/1.0, websocket, servlets, and more ☆3,960 Updated last week
- Mirror of Apache HttpClient ☆1,494 Updated this week
- Ehcache 3.x line ☆2,053 Updated last month
- Enterprise Stream Process Engine ☆3,891 Updated 2 years ago
- Apache Shiro ☆4,388 Updated this week
- A configurable web spider with an easy-to-use web console ☆998 Updated 6 years ago
- Apache Commons Lang ☆2,812 Updated this week
- Apache ActiveMQ Classic ☆2,369 Updated last week
- cglib - Byte Code Generation Library is a high-level API to generate and transform Java byte code. It is used by AOP, testing, data access … ☆4,858 Updated 10 months ago
- Apache HBase ☆5,357 Updated this week
- Elasticsearch Java Rest Client. ☆2,112 Updated 2 years ago
- A simple, agile, distributed Java crawler framework with Spring Boot support. ☆1,989 Updated 7 months ago
- A mature, highly concurrent JDBC Connection pooling library, with support for caching and reuse of PreparedStatements. ☆1,303 Updated 2 weeks ago
- JAVA WEB + ORM Framework ☆3,247 Updated this week
- Apache Curator