pengshuang / BDAP
☆11Updated this week
Related projects:
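- Text deduplication algorithm, developed from research on deduplicating news in recommender systems. It uses Yahoo's near-duplicates and shingling algorithm, with the server implemented in C and the client in Java, communicating through the thrift framework. For extensibility, deduplication can run on the server, and the server also exposes computation interfaces so clients can build their own extensions☆22Updated 10 years ago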
- ☆29Updated 8 years ago
- Core recommendation engine that uses Spark MLlib and Hbase for the model and Hive for data cleaning; tested on Spark on Yarn☆28Updated 7 years ago
- ☆11Updated this week
- web analysis and visualization for PPD Magic Mirror Contest☆44Updated 7 years ago
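- Spark Programming Guide, Simplified Chinese edition☆33Updated 8 years ago
- Distributed crawler framework: simulates user requests with webdriver, passes messages through kafka, stores web pages in hbase, and parses pages with multithreaded asynchronous tasks. Provides basic services such as a proxy ip service and a number verification service; the proxy page is accessed through the H5 and we versions☆13Updated 8 years ago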
- streaming-app☆8Updated 9 years ago
- Source code analysis of Spark, along with notes from day-to-day work☆31Updated 8 years ago
- Some ML demos (based on sklearn)☆12Updated 8 years ago
- Code for the Spark machine learning book☆26Updated 6 years ago
- ☆26Updated this week
- ☆14Updated this week
- ☆36Updated this week
- A collection of examples from learning Spark (with detailed code comments)☆24Updated 6 years ago
- High Performance Spark Streaming with Direct Kafka in Java☆39Updated 8 years ago
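- Alibaba Big Data Competition☆62Updated 10 years ago
- 2013-05 to 2015-02: sentiment analysis of product reviews☆15Updated 9 years ago
- dw etl tool: incremental and full extraction from mysql to hive, merging of hive tables, and other data platform cleaning tools☆9Updated 7 years ago
- Weibo sentiment analysis☆12Updated 11 years ago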
- ☆84Updated 7 years ago
- ☆52Updated 8 years ago
- Research and development of a search engine based on knowledge graph technology☆20Updated 7 years ago
- SparkSQL data analysis case studies☆23Updated 7 years ago
- News recommendation system based on spark.☆47Updated 7 years ago
- Kafka, Storm, Zookeeper, and Openfire running in Docker☆14Updated 8 years ago
- ☆15Updated this week
- A minimal display advertising system.☆47Updated 7 years ago
- Some hands-on machine learning practice☆10Updated 2 years ago
- ☆24Updated this week