apache / nutch
Apache Nutch is an extensible and scalable web crawler
☆3,013Updated last month
Alternatives and similar repositories for nutch
Users who are interested in nutch are comparing it to the libraries listed below.
- Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.☆2,969Updated this week
- Open Source Web Crawler for Java☆4,591Updated 3 years ago
- WebCollector is an open source web crawler framework based on Java. It provides some simple interfaces for crawling the Web; you can set up …☆3,072Updated 4 months ago
- Apache Lucene and Solr open-source search software☆4,379Updated 7 months ago
- Apache Shiro☆4,373Updated this week
- A scalable web crawler framework for Java.☆11,550Updated this week
- Apache Lucene open-source search software☆2,965Updated this week
- Apache ActiveMQ Classic☆2,351Updated last week
- Easy-to-use lightweight web crawler☆2,514Updated last year
- Hibernate's core Object/Relational Mapping functionality☆6,149Updated this week
- Apache Commons Lang☆2,798Updated this week
- Ehcache 3.x line☆2,048Updated 4 months ago
- Apache ZooKeeper☆12,471Updated this week
- Mirror of Apache HttpClient☆1,489Updated this week
- The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).☆2,985Updated this week
- A scalable, mature and versatile web crawler based on Apache Storm☆907Updated this week
- Apache Tomcat☆7,819Updated this week
- Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log-l…☆2,551Updated 7 months ago
- Drools is a rule engine, DMN engine and complex event processing (CEP) engine for Java.☆6,021Updated this week
- Code for Quartz Scheduler☆6,500Updated 3 weeks ago
- Apache HBase☆5,336Updated this week
- cglib - Byte Code Generation Library is high level API to generate and transform Java byte code. It is used by AOP, testing, data access …☆4,852Updated 9 months ago
- The reliable, generic, fast and flexible logging framework for Java.☆3,103Updated last month
- Apache Curator☆3,138Updated 3 weeks ago
- Mirror of Apache Mahout☆2,161Updated 3 weeks ago
- Spring integration for MyBatis 3☆2,868Updated 2 weeks ago
- Apache Solr open-source search software☆1,384Updated this week
- Jodd! Lightweight. Java. Zero dependencies. Use what you like.☆4,061Updated last year
- Apache Struts is a free, open-source, MVC framework for creating elegant, modern Java web applications☆1,312Updated this week
- The official MongoDB drivers for Java, Kotlin, and Scala☆2,635Updated this week