likaiguo / simhashpy
Uses the simhash algorithm to quickly index and query large numbers of text résumés
☆21 Updated 10 years ago
Alternatives and similar repositories for simhashpy
Users that are interested in simhashpy are comparing it to the libraries listed below
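- A simple trie implemented in Python that supports adding, looking up, and deleting keywords; used for keyword matching, stop-word removal, and similar tasks on Chinese text. ☆64 Updated 5 years ago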
- An optimized Python implementation of the Aho-Corasick (AC) automaton, mainly fixing inaccurate query results. ☆77 Updated 4 years ago
- A simple module for extracting opinions from reviews, based on LTP ☆117 Updated 7 years ago
- Chinese named entity recognition (company names), Tensorflow 1.3 + Python3 ☆37 Updated 8 years ago
- Computes Chinese text similarity using TF feature vectors and simhash fingerprints ☆217 Updated 9 years ago
- Word similarity computation based on Tongyici Cilin ☆121 Updated 8 years ago
- AI Challenger 2018 Sentiment Analysis Baseline with fastText ☆153 Updated 7 years ago
- E-Commerce Sentiment Dict ☆128 Updated 7 years ago
- New word discovery ☆66 Updated 11 years ago
- Computes text similarity with Doc2Vec ☆139 Updated 7 years ago
- Event monitor based on an online news corpus, including event storyline and analysis: given event keywords, it collects news about the event, then mines and analyzes it. ☆153 Updated 7 years ago
- WordMultiSenseDisambiguation, Chinese multi-word-sense disambiguation based on online baike knowledge base and semantic embedding similarit… ☆131 Updated 7 years ago
- Python3 version of Time-NLP: Chinese time expression recognition ☆90 Updated 5 years ago
- Train Wikidata with word2vec for word embedding tasks ☆123 Updated 7 years ago
- SmoothNLP domain vocabulary examples, based on the Fudan public news corpus ☆50 Updated 5 years ago
- New word discovery based on word frequency, cohesion coefficient, and left/right neighbor information entropy ☆122 Updated 5 years ago
- New word discovery algorithm (NewWordDetection) ☆92 Updated 4 years ago
- Syntax and Rule-Based Doc sentiment analysis: a demo of document-level sentiment analysis based on dependency-syntax rules ☆107 Updated 6 years ago
- Sentence Distance ☆55 Updated 7 years ago
- ☆57 Updated 4 years ago
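- Tookit-Sihui, a toolkit of common algorithms: AI mixed-text scientific calculator (calculator-sihui), sentence term frequency-inverse document frequency (TF-IDF), BM25 search, trie-based keyword search (trietree), template matching with recursive functions (fu… ☆24 Updated 4 years ago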
- [Summary] FDDC2018 Financial Algorithm Challenge 02: information extraction from A-share listed company announcements ☆94 Updated 7 years ago
- 2018 ATEC Ant Financial NLP intelligent customer service competition, ranked 16th/2632 ☆111 Updated 7 years ago
- Short text clustering preprocessing module (Short text cluster) ☆281 Updated 6 years ago
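- Built on a self-hosted LTP server, implements: word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, and semantic role labeling; named entity extraction (person, place, and organization names); and triple extraction (subject-verb-object, verb-object, and preposition-object relations, expressed as (entity 1, relation, entity 2)) ☆143 Updated 8 years ago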
- A class containing common text similarity algorithms, such as longest common subsequence, shortest token distance, and TF-IDF ☆63 Updated 8 years ago
- New word discovery algorithm and synonym mining ☆27 Updated 8 years ago
- WordForm: an interface for Chinese words providing stroke decomposition, radical lookup, and pinyin conversion ☆65 Updated 7 years ago
- Experiments with and a comparison of four sentence/text similarity methods ☆291 Updated 5 years ago
- A deep text classifiers library. ☆37 Updated 7 years ago