jannson / simhash-py
Simhash and near-duplicate detection
☆17Updated 11 years ago
Alternatives and similar repositories for simhash-py:
Users who are interested in simhash-py are comparing it to the libraries listed below
- A Slot-filling based Dialog Manager for Task-oriented Bot☆11Updated 8 years ago
- A Java JNI wrapper for KenLM: Faster and Smaller Language Model Queries☆12Updated 4 years ago
- A Simpler GloVe model for distributed word representation☆86Updated 3 years ago
- Dilation Gate CNN For Machine Reading Comprehension☆17Updated 2 years ago
- BERT sentiment analysis with TensorFlow Serving and a RESTful API☆33Updated 6 years ago
- Deep structured semantic model☆32Updated 8 years ago
- High-performance small model evaluation: Shared Tasks in NLPCC 2020. Task 1 - Light Pre-Training Chinese Language Model for NLP Task☆58Updated 4 years ago
- Code for sequence-to-sequence learning for grammatical error detection in Chinese☆12Updated 7 years ago
- A TensorFlow usage example, adapted from https://guillaumegenthial.github.io/sequence-tagging-with-tensorflow.html ☆36Updated 7 years ago
- Supervised Latent Dirichlet Allocation for Classification☆85Updated 4 years ago
- Lattice LSTM cell implementation with TensorFlow☆30Updated 6 years ago
- New word discovery☆66Updated 10 years ago
- Code for the Convolutional Latent Semantic Model, which is similar to DSSM (Deep Semantic Similarity Model).☆25Updated 9 years ago
- ☆59Updated 5 years ago
- A deep text classifiers library.☆37Updated 6 years ago
- Tools used to do Chinese Word Segmentation☆22Updated 11 years ago
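- Code for AI Challenger 2018 machine reading comprehension☆27Updated 6 years ago
- Chinese Tokenizer; New words Finder. A three-stage mechanical word segmentation algorithm for Chinese; an out-of-vocabulary new word discovery algorithm☆95Updated 8 years ago
- Word segmentation implemented with Python and CRF++☆37Updated 6 years ago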
- Sub-Character Representation Learning☆25Updated 6 years ago
- Clone of "A Good Part-of-Speech Tagger in about 200 Lines of Python" by Matthew Honnibal☆48Updated 8 years ago
- ☆14Updated 8 years ago
- New word detection algorithm (NewWordDetection)☆62Updated 7 years ago
- Details of paper cw2vec☆82Updated 6 years ago
- Dataset for CIKM 2018 paper "Multi-Source Pointer Network for Product Title Summarization"☆74Updated 6 years ago
- SegPhrase working on Chinese and Arabic☆35Updated 8 years ago
- An HMM-like linear-chain CRF, using the TensorFlow API.☆36Updated 7 years ago
- KenLM language model, with a Python REST service☆29Updated 6 years ago
- A RoBERTa-wwm base distilled model, distilled from RoBERTa-wwm by RoBERTa-wwm large☆65Updated 5 years ago
- Modification of the official BERT for downstream tasks☆31Updated 2 years ago