yingrui / mahjong
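An open-source Chinese word segmentation toolkit: Chinese word segmentation Web API, Lucene Chinese word segmentation, and mixed Chinese-English segmentation
☆43 Updated 4 years ago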
Alternatives and similar repositories for mahjong:
Users who are interested in mahjong are comparing it to the libraries listed below:
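- A text deduplication algorithm that grew out of research on news deduplication in recommender systems. It uses Yahoo's near-duplicates and shingling algorithm, with the server implemented in C and the client in Java, communicating over the Thrift framework. For extensibility, deduplication can run on the server, which also exposes computation interfaces so that clients can build their own extensions ☆23 Updated 11 years ago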
- A crawler service built on Akka: non-blocking, highly concurrent, real-time ☆26 Updated 9 years ago
- This project has moved to https://github.com/cocolian/cocolian-nlp ☆34 Updated 10 years ago
- LASER: A Scalable Response Prediction Platform For Online Advertising ☆48 Updated 10 years ago
- A Chinese natural language processing toolkit ☆86 Updated 9 years ago
- Predictive analytics using Deeplearning4j and Spark ☆26 Updated 8 years ago
- ☆10 Updated 9 years ago
- A LambdaMART implementation based on Spark ☆11 Updated 10 years ago
- tyccl (同义词词林, the Tongyici Cilin synonym thesaurus) is a Ruby gem that provides friendly functions to analyse similarity between Chinese words. ☆46 Updated 11 years ago
- Fudan University's Chinese natural language toolkit ☆72 Updated 7 years ago
- A core recommendation engine that uses Spark MLlib, with HBase for the model and Hive for data cleaning; tested on Spark on YARN ☆29 Updated 7 years ago
- ☆10 Updated 9 years ago
- An implementation of a multi-class/multi-label classifier, trained using AdaBoost.MH on Apache Spark. ☆107 Updated 10 years ago
- A Java implementation of LDA ☆62 Updated 9 years ago
- A fork of cascading patterns, but implemented for trident ☆71 Updated last year
- Item-Based Collaborative Filtering Spark Job (uses cosine similarity) ☆37 Updated 8 years ago
- Three open source versions of LDA with collapsed Gibbs Sampling, modified by nanjunxiao ☆26 Updated 9 years ago
- Chinese tokenizer and new-word finder: a three-stage mechanical Chinese word segmentation algorithm and an out-of-vocabulary new-word discovery algorithm ☆95 Updated 8 years ago
- A Chinese Word Segmentation Tool Based on a Bayes Model ☆81 Updated 11 years ago
- A distributed, high-performance Word2Vec implementation ☆15 Updated 9 years ago
- Graph algorithms implemented in GraphX and Spark styles ☆15 Updated 9 years ago
- Tag documents using the top-N words from LDA ☆10 Updated 9 years ago
- stan-cn-nlp: an API wrapper based on Stanford NLP packages for the convenience of Chinese users ☆57 Updated 8 years ago
- Ytk-mp4j is a fast, user-friendly, cross-platform, multi-process, multi-thread collective message passing java library which includes gat… ☆107 Updated 7 years ago
- yet another segmenter ☆21 Updated 11 years ago
- Spark MLlib code optimized to efficiently support sparse data ☆51 Updated 8 years ago
- Open Source Simple Web Crawler for Java. Simple, Flexible, and Lightweight ☆29 Updated 2 years ago
- An interface of mllib and ml algorithms implemented by jddata with spark ☆23 Updated 10 years ago
- An iterative computing framework for both Hadoop MapReduce and Hadoop YARN. ☆71 Updated 2 years ago
- Stanford CoreNLP: A Java suite of core NLP tools. ☆8 Updated 8 years ago