CSE601-DataMining / Clustering
Implement three clustering algorithms to find clusters of genes that exhibit similar expression profiles: K-means, hierarchical agglomerative clustering with single link (min), and one of density-based, mixture-model, or spectral clustering. Set up a single-node Hadoop cluster on your machine and implement MapReduce K-means. Compare with non-parallel…
☆12 Updated 10 years ago
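The description above mentions MapReduce K-means. As a rough illustration only (not the repository's code, and plain Python rather than Hadoop), one iteration can be split into a map step that assigns each point to its nearest centroid and a reduce step that averages each group. The function names `kmeans_map`, `kmeans_reduce`, and `kmeans` are hypothetical, and the sketch assumes points are tuples of floats with caller-supplied initial centroids.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def kmeans_map(point, centroids):
    """Map step: emit (index of the nearest centroid, point)."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, point


def kmeans_reduce(points):
    """Reduce step: average all points that share one centroid index."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, centroids, max_iter=100):
    """Repeat map and reduce until the centroids stop changing."""
    for _ in range(max_iter):
        # Shuffle phase: group mapped points by centroid index.
        groups = defaultdict(list)
        for p in points:
            key, value = kmeans_map(p, centroids)
            groups[key].append(value)
        # An empty cluster keeps its previous centroid (an assumption of this sketch).
        new_centroids = [
            kmeans_reduce(groups[i]) if i in groups else centroids[i]
            for i in range(len(centroids))
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    print(kmeans(data, [(0.0, 0.0), (10.0, 10.0)]))
```

In a real Hadoop job the grouping loop would be handled by the framework's shuffle, and each iteration would typically run as a separate job that reads the previous centroids.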
Alternatives and similar repositories for Clustering:
Users who are interested in Clustering are comparing it to the libraries listed below:
- Several implementations for building HBase secondary indexes. ☆39 Updated 9 years ago
- Computes the similarity of two feature vectors. ☆26 Updated 6 years ago
- A self-built hadoop + spark + kafka + zookeeper + storm + hbase + hive + flume cluster, with one master node and two worker nodes. ☆30 Updated 6 years ago
- ☆11 Updated 7 years ago
- A simple clustering algorithm from data mining, using JFreeChart to display the results. ☆10 Updated 9 years ago
- ☆21 Updated 8 years ago
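- Big data: partial source code for tag development in an "enterprise-grade 360° all-round user profile" project. ☆19 Updated 4 years ago
- An example application of Word2vec and Doc2vec: computing word similarity with Word2vec and sentence similarity with doc2vec. ☆26 Updated 7 years ago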
- Spark Streaming + kafka + hbase ☆15 Updated 6 years ago
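- A machine learning platform based on Spark and Kubernetes. ☆30 Updated 7 years ago
- A text deduplication algorithm that grew out of research on deduplicating news in a recommender system. It uses Yahoo's near-duplicates and shingling algorithm; the server is implemented in C and the client in Java, and they communicate through the Thrift framework. To improve extensibility, deduplication can run on the server, and the server also exposes computation interfaces so that clients can extend it themselves. ☆24 Updated 11 years ago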
- Implementation of text clustering algorithms including K-means, MBSAS, DBSCAN. ☆44 Updated 7 years ago
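- A core recommendation engine that uses Spark's MLlib and HBase for the model and Hive for data cleaning; tested successfully on Spark on Yarn. ☆29 Updated 8 years ago
- HBase secondary indexing implemented with hbase + solr. ☆48 Updated last month
- Spark Programming Guide, Simplified Chinese edition. ☆33 Updated 8 years ago
- Using Zhihu Daily as the data source, works through a complete machine learning process from data acquisition to data analysis: clustering and classifying Zhihu Daily content and visualizing the process. ☆17 Updated 9 years ago
- A dictionary-based scoring algorithm for negative public-opinion information. ☆26 Updated 10 years ago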
- UDF, GenericUDF, UDTF, UDAF ☆12 Updated 2 years ago
- A Spark MLlib smart transportation project. ☆15 Updated 6 years ago
- Offline deployment of Spark PMML models. ☆12 Updated 2 years ago
- ☆12 Updated 8 years ago
- DBSCAN clustering algorithm implemented in Apache Spark (MapReduce Framework). ☆14 Updated 8 years ago
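- lyq's algorithm library, covering algorithms from many fields, including data mining, compression and decompression, pattern matching, and graph algorithms. ☆132 Updated 9 years ago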
- A Spark-based semantic reasoning engine ☆14 Updated 8 years ago
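- HanLP tests. ☆16 Updated 7 years ago
- Models for a user profiling system implemented in Spark, including value, loyalty, churn warning, and activity models. ☆66 Updated 7 years ago
- Java implementations of common text clustering algorithms. ☆15 Updated 10 years ago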
- Some code for Spark ☆17 Updated 8 years ago
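- Wrappers for the algorithms in Spark MLlib 1.6.0. ☆11 Updated 8 years ago
- "Clustering by fast search and find of density peaks" is a paper published in Science in June of this year; it proposes a very clever clustering algorithm. After several days of effort, I finally implemented the paper's algorithm in Java. Below, I share my own… ☆22 Updated 10 years ago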