CSE601-DataMining / Clustering
Implement three clustering algorithms to find clusters of genes that exhibit similar expression profiles: K-means, Hierarchical Agglomerative clustering with Single Link (Min), and one chosen from density-based, mixture model, or spectral clustering. Set up a single-node Hadoop cluster on your machine and implement MapReduce K-means. Compare with non-parallel…
☆12Updated 10 years ago
Alternatives and similar repositories for Clustering:
Users who are interested in Clustering are comparing it to the repositories listed below:
- A self-built hadoop + spark + kafka + zookeeper + storm + hbase + hive + flume cluster with one master and two worker nodes.☆30Updated 6 years ago
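- Several implementations for building HBase secondary indexes.☆39Updated 9 years ago
- An example application of Word2vec and Doc2vec: computing word similarity with Word2vec and sentence similarity with doc2vec.☆26Updated 7 years ago
- Partial source code for the tag-development portion of the big data project "Enterprise-Grade 360° Comprehensive User Profiling".☆19Updated 4 years ago
- HBase secondary indexing implemented with hbase + solr.☆48Updated last month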
- High Performance Spark Streaming with Direct Kafka in Java☆39Updated 8 years ago
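- Spark implementations of user-profile system models such as customer value, loyalty, churn warning, and activity level.☆66Updated 7 years ago
- A dictionary-based scoring algorithm for negative public-opinion information.☆26Updated 10 years ago
- Common recommendation algorithms implemented in R: itemCF, UserCF, Tags, SVD, Apriori.☆18Updated 8 years ago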
- Refactored version for https://github.com/shirdrn/document-processor.git☆15Updated 8 years ago
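- Recommendation algorithms.☆30Updated 9 years ago
- A text deduplication algorithm, developed from work on deduplicating news in a recommender system. It uses Yahoo's Near-duplicates and shingling algorithm; the server is implemented in C and the client in Java, and they communicate through the thrift framework. For extensibility, deduplication can run on the server, and the server also exposes a computation interface so that clients can build their own extensions.☆24Updated 11 years ago
- Large-scale deduplication of massive data based on Hadoop and HBase.☆29Updated 7 years ago
- A comprehensive walkthrough of the basic algorithms in Spark MLlib, the machine learning library of the Spark big data framework, with a complete set of test files.☆39Updated last year
- Recommendation algorithm implementations based on spark-ml, spark-mllib, and spark-streaming.☆96Updated 5 years ago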
- UDF, GenericUDF, UDTF, UDAF☆12Updated 2 years ago
- spark mllib example☆28Updated 9 years ago
- Examples for Spark1.6 and spark2.2, covering kafka, flume, structuredstreaming, jedis, elasticsearch, mysql, and dataframe.☆15Updated 7 years ago
- A distributed machine learning algorithm for new word discovery.☆15Updated 10 years ago
- Wiki describing how to use Kafka Eagle.☆11Updated 5 years ago
- Showcase for our blog entry about Spring Data Neo4j.☆31Updated 11 years ago
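- A simple clustering algorithm from data mining, using JFreeChart to display the classification results.☆11Updated 9 years ago
- Data cleaning system; hadoop; entity recognition; conflict resolution; inconsistency repair; missing-value imputation.☆17Updated 9 years ago
- Simplified Chinese edition of the Spark Programming Guide.☆33Updated 8 years ago
- Best practices for distributed data warehouses.☆57Updated 7 years ago
- hadoop hbase use cases and examples, including MR, HBaseUtil...☆36Updated 11 years ago
- A complete set of Spark example code (java, scala).☆83Updated 5 years ago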
- Spark Streaming + kafka + hbase☆15Updated 6 years ago