17zuoye / detdup
Detect duplicated items. A content deduplication framework.
☆11Updated 9 years ago
Alternatives and similar repositories for detdup:
Users that are interested in detdup are comparing it to the libraries listed below
- tyccl(同义词词林) is a Ruby gem that provides friendly functions to analyse similarity between Chinese words.☆46Updated 11 years ago
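- Open-source Chinese word segmentation toolkit: Chinese word segmentation Web API, Lucene Chinese word segmentation, and mixed Chinese-English segmentation☆43Updated 4 years ago
- Chinese natural language processing toolkit☆86Updated 9 years ago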
- A Chinese Words Segmentation Tool Based on Bayes Model☆79Updated 11 years ago
- Auto-generate Chinese words from huge text.☆91Updated 10 years ago
- My GitHub Hubot scripts.☆12Updated 9 years ago
- A tool for collecting and managing fragments☆8Updated 7 years ago
- Parser for Sogou Input Method cell dictionaries☆15Updated 11 years ago
- Sentiment Analysis on Google's Chinese 1gram dataset☆15Updated 7 years ago
- A Python package for pullword.com☆86Updated 4 years ago
- A small tool generates html exactly like github with TOC support.☆63Updated 5 years ago
- NanGe - A Rule-based Chinese-English Machine Translation System☆20Updated 7 years ago
- Yet another Chinese word segmentation package based on character-based tagging heuristics and CRF algorithm☆245Updated 12 years ago
- Tools for Chinese word segmentation and POS tagging written in Python☆38Updated 11 years ago
- A graphical view of the relationships between github users.☆38Updated 5 years ago
- Thank-you-follow-me Ha Ha Ha!☆42Updated 9 years ago
- A modern online judge engine, adding new problems without writing code☆20Updated 7 years ago
- Pure python NLP toolkit☆55Updated 9 years ago
- Fudan University's Chinese natural language toolkit☆72Updated 7 years ago
- Read&Learn English Books Easily☆25Updated 8 years ago
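- A Chinese Webpage Title Text Categorization Tool (short-text classification). Pure C/C++ version: https://github.com/MagnusBai/webpage_categorizati…☆20Updated 8 years ago
- A Python implementation of 《基于行块分布函数的通用网页正文抽取》 (general web page main-text extraction based on a line-block distribution function)☆30Updated 10 years ago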
- A movie search using haystack and whoosh☆21Updated 11 years ago
- Discontinued; migrated to https://github.com/PyChina/blog☆10Updated 8 years ago
- Wiser is a simple search engine from “検索エンジン自作入門”(how to develop search engine).☆20Updated last month
- [Translation] Natural Language Processing with Python, second Chinese edition☆63Updated 6 years ago
- http://guidetodatamining.com☆49Updated 6 years ago
- Scrapy Spider for SinaFinance, FTChinese, CFI.☆22Updated 10 years ago
- Distributed text analysis suite based on Celery☆95Updated 2 years ago
- Detect duplicated items framework. A content deduplication framework.☆12Updated 9 years ago