CSE601-DataMining / Clustering
Implement three clustering algorithms to find clusters of genes that exhibit similar expression profiles: K-means, Hierarchical Agglomerative clustering with Single Link (Min), and one from (density-based, mixture model, spectral). Set up a single-node Hadoop cluster on your machine and implement MapReduce K-means. Compare with non-parallel…
☆11Updated 10 years ago
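For readers unfamiliar with the first algorithm named in the description, here is a minimal sketch of non-parallel K-means (Lloyd's algorithm) in plain Python. It is not taken from the repository: the function name `kmeans`, the toy `profiles` data, and the choice of initial centroids are all assumptions made for illustration.

```python
from math import dist
from statistics import fmean


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and update until centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid (Euclidean distance).
        labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(fmean(col) for col in zip(*members)))
            else:
                new_centroids.append(old)  # an empty cluster keeps its old centroid
        if new_centroids == centroids:
            break  # converged: no centroid changed
        centroids = new_centroids
    return labels, centroids


# Hypothetical gene expression profiles (made-up values, two conditions per gene).
profiles = [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9), (5.0, 5.2), (5.1, 4.9), (4.9, 5.0)]
labels, centers = kmeans(profiles, centroids=[profiles[0], profiles[3]])
print(labels, centers)
```

In a MapReduce formulation, the assignment step would typically run in the mappers and the centroid update in the reducers, with one job per iteration; that mapping is a common design, not a description of this repository's code.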
Alternatives and similar repositories for Clustering:
Users who are interested in Clustering are comparing it to the libraries listed below
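- A self-built hadoop + spark + kafka + zookeeper + storm + hbase + hive + flume cluster with one master and two worker nodes.☆30Updated 6 years ago
- An application example of Word2vec and Doc2vec: computing word similarity with Word2vec and sentence similarity with doc2vec.☆26Updated 7 years ago
- Models for a user profiling system implemented in Spark, including value, loyalty, churn warning, and activity level☆66Updated 7 years ago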
- ☆11Updated 7 years ago
- A curated collection of security-related mind maps☆11Updated 9 years ago
- Several implementations for building an hbase secondary index.☆39Updated 8 years ago
- ☆21Updated 8 years ago
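- Computes the similarity between two feature vectors☆26Updated 5 years ago
- An es word-segmentation plugin based on the hanlp toolkit☆10Updated 6 years ago
- A machine learning platform based on Spark and Kubernetes☆30Updated 6 years ago
- Spark Programming Guide, Simplified Chinese edition☆33Updated 8 years ago
- Big data: partial source code for the tag development of an enterprise-grade, 360° all-round user profile☆19Updated 4 years ago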
- Spark Streaming + kafka + hbase☆15Updated 6 years ago
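- Large-scale deduplication of massive data based on Hadoop and HBase☆29Updated 6 years ago
- Common data mining and machine learning algorithms☆33Updated 11 years ago
- A core recommendation engine that uses Spark MLlib and Hbase for the model and Hive for data cleaning; tested on Spark on Yarn☆29Updated 7 years ago
- A text deduplication algorithm that grew out of studying news deduplication in recommender systems. It uses Yahoo's near-duplicates and shingling algorithm; the server is implemented in C and the client in Java, and they communicate through the thrift framework. For extensibility, deduplication can run on the server, and the server also exposes a computation interface so that clients can extend it themselves☆23Updated 10 years ago
- Recommendation algorithms☆30Updated 9 years ago
- Secondary indexing for hbase, implemented with hbase+solr☆47Updated 3 years ago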
- ☆12Updated 7 years ago
- A dw etl tool: incremental and full extraction from mysql to hive, merging of hive tables, and other data-platform cleaning utilities☆9Updated 8 years ago
- Refactored version for https://github.com/shirdrn/document-processor.git☆15Updated 7 years ago
- A Web Page Of Public Sentiment For P2P Industry (front-end display of public sentiment analysis for the P2P industry)☆25Updated 8 years ago
- Analyzing Jin Yong's "Condor Trilogy" with Spark Graphx☆46Updated 4 years ago
- ☆15Updated 5 years ago