17zuoye / detdup
Detect duplicated items. A content deduplication framework.
☆11Updated 9 years ago
Alternatives and similar repositories for detdup:
Users that are interested in detdup are comparing it to the libraries listed below
- tyccl (同义词词林) is a Ruby gem that provides friendly functions to analyse similarity between Chinese words.☆46Updated 10 years ago
- NanGe - A Rule-based Chinese-English Machine Translation System☆20Updated 7 years ago
- Chinese natural language processing toolkit☆86Updated 9 years ago
- Yet another Chinese word segmentation package based on character-based tagging heuristics and CRF algorithm☆243Updated 11 years ago
- ☆68Updated 9 years ago
- A Chinese Word Segmentation Tool Based on a Bayes Model☆78Updated 11 years ago
- Thank-you-follow-me Ha Ha Ha!☆42Updated 8 years ago
- A Python package for pullword.com☆83Updated 4 years ago
- A tool for collecting and managing fragments☆8Updated 6 years ago
- Detect duplicated items framework. A content deduplication framework.☆12Updated 9 years ago
- Tools for Chinese word segmentation and POS tagging written in Python☆38Updated 11 years ago
- Some articles written by Bao Jie☆0Updated 8 years ago
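- [CC-BY-4.0] In the winter of 2017, after a major fire, Beijing launched a "special campaign of large-scale inspection, clean-up, and rectification of safety hazards," using the occasion to evict large numbers of migrant workers living in rental apartments, industrial parks, and similar areas. The action came abruptly and affected dozens of village- and town-level units and upwards of a million people. Initium Media (端傳媒), working remotely with civil organizations and student volunteers, compiled 292 valid data entries and organized…☆26Updated 6 years ago
- Sina Weibo crawler☆46Updated 9 years ago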
- My GitHub Hubot scripts.☆12Updated 9 years ago
- PyQt package for a Blizzard-like UI☆13Updated 8 years ago
- Sentiment Analysis on Google's Chinese 1gram dataset☆15Updated 7 years ago
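- Image search for photos posted in QQ groups☆11Updated 8 years ago
- A Python implementation of the paper 《基于行块分布函数的通用网页正文抽取》 ("General-Purpose Web Page Main Text Extraction Based on the Line-Block Distribution Function")☆30Updated 10 years ago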
- Distributed text analysis suite based on Celery☆95Updated 2 years ago
- Comparative analysis of word use between chapters 1 to 80 and chapters 80 to 120 of "A Dream of Red Mansions".☆76Updated 6 years ago
- Pure python NLP toolkit☆55Updated 8 years ago
- Parser for Sogou Input Method cell dictionaries (细胞词库)☆14Updated 11 years ago
- A graphical view of the relationships between GitHub users.☆38Updated 5 years ago