yohokuno / count-ngram
Count frequent n-grams in big data with limited memory.
☆59 Updated 11 years ago
Alternatives and similar repositories for count-ngram:
Users who are interested in count-ngram are comparing it to the libraries listed below.
- Code for the ACL-2015 paper "Accurate Linear-Time Chinese Word Segmentation via Embedding Matching" ☆38 Updated 9 years ago
- Code for "Exploring Segment Representations for Neural Segmentation Models" ☆30 Updated 8 years ago
- Deep reinforcement learning with TensorFlow ☆47 Updated 7 years ago
- Deep Learning for NLP resources ☆17 Updated 9 years ago
- ☆29 Updated 9 years ago
- A high-performance CUDA port of the CBOW model of word2vec ☆17 Updated 10 years ago
- Word segmentation using neural networks, based on the package https://github.com/SUTDNLP/LibN3L ☆23 Updated 9 years ago
- LibN3L: A light-weight neural network package for natural language ☆82 Updated 9 years ago
- tyccl (同义词词林) is a Ruby gem that provides friendly functions to analyze similarity between Chinese words. ☆46 Updated 11 years ago
- The experiment software underlying two papers published at ECIR-2015 and SEMEVAL-2015. ☆37 Updated 9 years ago
- Sentiment Analysis with Ensemble ☆244 Updated 8 years ago
- Parallelizing word2vec in shared and distributed memory ☆190 Updated 2 years ago
- Chinese tokenizer and new-word finder: a three-stage mechanical Chinese word segmentation algorithm and an algorithm for discovering out-of-vocabulary new words ☆95 Updated 8 years ago
- Chinese Word Similarity Computation based on HowNet ☆27 Updated 7 years ago
- Distributed LDA; takes raw text as input and outputs a topic-word table. ☆16 Updated 8 years ago
- A C++ GBDT tool. Very fast on a single machine. The author has had no time to make a distributed version. ☆22 Updated 8 years ago
- Topical Word Embeddings ☆55 Updated 7 years ago
- Cache-efficient implementation of Latent Dirichlet Allocation ☆162 Updated 6 years ago
- ☆70 Updated 9 years ago
- Deep Character-Level Neural Machine Translation ☆71 Updated 8 years ago
- word2vec variations ☆7 Updated 7 years ago
- PLDA: Parallel Latent Dirichlet Allocation in C++ ☆85 Updated last year
- Automatically generate Chinese words from huge text. ☆91 Updated 10 years ago
- Recurrent Neural Networks (GRU) for character-level language models on Chinese, in Python/Theano ☆63 Updated 7 years ago
- A light-weight matrix factorization tool ☆39 Updated 7 years ago
- A Chinese word segmenter based on CRF ☆233 Updated 6 years ago
- Some articles written by Bao Jie. Updated 8 years ago
- Chinese natural language processing toolkit ☆86 Updated 9 years ago
- Code for the Convolutional Latent Semantic Model, which is similar to DSSM (Deep Semantic Similarity Model). ☆25 Updated 9 years ago
- Character-level language models ☆77 Updated 7 years ago