WebHunger is an extensible, full-scale crawler framework that supports distributed crawling. It aims to let users focus on web page parsing without having to worry about the crawling process.
☆18 Apr 11, 2018 Updated 8 years ago
Alternatives and similar repositories for webhunger
Users that are interested in webhunger are comparing it to the libraries listed below.
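- Entboost instant messaging software, an enterprise collaborative office platform http://www.entboost.com ☆11 Jan 22, 2017 Updated 9 years ago
- spider of doubanbook ☆10 Jun 21, 2017 Updated 8 years ago
- Messaging system ☆15 Jan 19, 2017 Updated 9 years ago
- A simple rule matcher implemented with an inverted index and binary search ☆18 Mar 14, 2019 Updated 7 years ago
- Sync is a safe and efficient Redis-based thread synchronization component for distributed scenarios. It provides a distributed reentrant mutex, a distributed reentrant read-write lock, and a distributed semaphore, along with corresponding annotations. It is simple to use and integrates seamlessly with spring-boot. ☆13 Oct 8, 2022 Updated 3 years ago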
- Distributed crawler framework that simulates user requests with webdriver, passes messages through kafka, stores web pages in hbase, and parses pages with multithreaded asynchronous tasks. It provides basic services such as a proxy ip service and a number verification service; the proxy page is accessed through H5 and "we" versions ☆13 Dec 18, 2015 Updated 10 years ago
- Semantic Parser with Execution ☆13 Dec 8, 2017 Updated 8 years ago
- A (massive) DNS tools (reverse lookup for now) ☆12 Jul 6, 2022 Updated 3 years ago
- Just a DEMO to demonstrate how to use JNA to type chars into alipay's password edit control automatically. ☆12 Dec 21, 2017 Updated 8 years ago
- Sample AWS Batch project to read CSV files ☆11 Oct 22, 2017 Updated 8 years ago
- A simple state machine implementation, illustrated with a simplified order state machine as an example. ☆16 Oct 13, 2020 Updated 5 years ago
- ☆13 Jun 14, 2016 Updated 9 years ago
- Swip - Plugin for IntelliJ IDEA that can create a fully functional (Spring Boot) WebApp with just a few clicks ☆13 Jan 4, 2020 Updated 6 years ago
- Aggregated tech news reading in three minutes a day ☆18 May 15, 2018 Updated 8 years ago
- A module that processes new Edgar filings and sends out notifications ☆14 Dec 28, 2015 Updated 10 years ago
- Anything we need to maintain the Linked Open Data (LOD) publication of CEUR-WS.org ☆16 Jun 10, 2020 Updated 5 years ago
- Detect memory leaks in minutes without a heap dump. ☆17 Apr 7, 2017 Updated 9 years ago
- Predict the Race of a Given Surname Using Census Data ☆13 Jul 5, 2023 Updated 2 years ago
- NYC Data Science Academy capstone project - build event driven financial model using deep learning artificial neural network. ☆15 Mar 27, 2017 Updated 9 years ago
- ner using crf++ ☆10 Mar 24, 2015 Updated 11 years ago
- Java SDK for the TextRazor Text Analytics API ☆16 Mar 2, 2026 Updated 2 months ago
- Crawler framework that wraps tools such as HttpClient, Htmlunit, and Selenium ☆26 Nov 15, 2018 Updated 7 years ago
- The distributed statistical machine translation infrastructure consisting of load balancing, text pre/post-processing and translation ser… ☆12 Nov 29, 2018 Updated 7 years ago
- Demo of JavaAgent memory shell implementation, detection, and remediation ☆11 Dec 7, 2022 Updated 3 years ago
- Keyword Extraction system using Brown Clustering - (This version is trained to extract keywords from job listings) ☆18 Sep 16, 2014 Updated 11 years ago
- Spring Boot Starter For Netty-socketio ☆73 May 16, 2026 Updated last week
- an idiomatic port of FlashText.py to Java using streams ☆14 Sep 27, 2024 Updated last year
- Fine-grained named entity recognition using BERT ☆11 Feb 5, 2020 Updated 6 years ago
- API - extract a list of keywords from a text. ☆18 Jul 6, 2017 Updated 8 years ago
- black Ip lists, dorks-collection ☆17 May 1, 2026 Updated 3 weeks ago
- Data platform (DataPlateform). The original design idea was that, with big data everywhere today, we should not fall behind, so we set out to write such a platform system. This project combines crawling, search, Hadoop, Dwr push, and Quartz scheduled tasks in one platform. Its goal is to crawl internet data and use big data to infer the next behavior of a person or thing. C… ☆18 Jul 31, 2017 Updated 8 years ago
- Grawlox is a profanity filter which offers methods for detecting and replacing swearwords. ☆11 Jul 16, 2017 Updated 8 years ago
- DJIA index prices of 10 years and NYtimes news articles headline has been used to predict the DJIA index prices ☆18 Feb 21, 2018 Updated 8 years ago
- Apache NiFi NLP Processor ☆18 May 8, 2026 Updated 3 weeks ago
- In the second half of 2018, Google open-sourced Jib, a new Java tool that makes it easy to containerize Java applications. With Jib, there is no need to write a Dockerfile or install Docker; by integrating it as a Maven or Gradle plugin, Java applications can be containerized right away… ☆21 Apr 7, 2019 Updated 7 years ago
- Extra pluggable modules for Apache MetaModel (but licensed with LGPL) ☆17 Jan 2, 2022 Updated 4 years ago
- IOC, AOP, REST... ☆14 Apr 4, 2017 Updated 9 years ago
- ☆13 Jul 16, 2023 Updated 2 years ago
- Neural Machine Translation of WordPress Strings ☆21 Dec 8, 2022 Updated 3 years ago