CyberCommy / baidu-wiki-500w
Baidu Baike 5-million-entry dataset
☆30 Updated 11 months ago
Related projects
Alternatives and complementary repositories for baidu-wiki-500w
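- Baidu QA 1-million-entry dataset ☆49 Updated 11 months ago
- PNW, a Chinese new-word discovery algorithm that can identify new words of any length. ☆15 Updated last year
- Domain-specific lexicon construction / Chinese new-word discovery / domain lexicon discovery ☆28 Updated 4 years ago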
- ZhidaoChatbot, a chatbot that can act as an expert on common questions such as why, how, when, who, and what, based on the online question-answer web… ☆42 Updated 5 years ago
- A from-scratch implementation of Elasticsearch's inverted index and the BM25 algorithm ☆45 Updated 5 years ago
- A Python toolkit for file processing, text cleaning, and data splitting. ☆29 Updated 2 years ago
- A tool for time-expression extraction, parsing, and normalization ☆49 Updated 2 years ago
- ☆37 Updated 5 years ago
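- The most complete Chinese dictionaries ever: a categorized Chinese lexicon spanning 12 major categories, including geographic information, video games, engineering applications, agriculture/forestry/animal husbandry/fishery, humanities, social sciences, everyday knowledge, medicine and pharmaceuticals, art and design, entertainment and leisure, sports and leisure, and natural sciences. ☆70 Updated 4 years ago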
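- From jieba word segmentation to BERT-wwm: a step-by-step introduction to the world of Chinese NLP ☆14 Updated 2 years ago
- Medical corpus. Corpus of medical institution names. Chinese national drug codes (药品本位码). ☆57 Updated 7 months ago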
- Self-implemented SpellCorrection: query spelling correction based on pinyin similarity and edit distance. ☆79 Updated 2 years ago
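- Fine-tunes pretrained language models (BERT, Roberta, XLBert, etc.) to compute the similarity between two texts (recast as a sentence-pair classification task); suitable for Chinese text ☆90 Updated 4 years ago
- Cleans corpora collected from dbpedia and Baike (encyclopedia) sites to obtain suitable triples ☆14 Updated 7 years ago
- A retrieval-based dialogue system solution built on vector recall: dense retrieval, FAQ… ☆32 Updated 3 years ago
- 🤖️ Chatbot: the natural language understanding module of Fuzi (夫子) ☆88 Updated last year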
- Sentence-Transformers Information Retrieval example on Chinese ☆29 Updated 9 months ago
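- Publicly released JD/Taobao customer-service dialogue data; a dialogue system built on a seq2seq generative model that won second place ☆39 Updated last year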
- deep training task ☆29 Updated last year
- Large-scale exact string matching tool ☆15 Updated last week
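- A simple, easy-to-use Python module for manipulating dates and times via strings. Regex-based time extraction, string time parsing, string time extraction. Chinese time extraction: pulls times out of a sentence ☆75 Updated 4 months ago
- Corpus of book titles. Also includes some film and game titles. ☆66 Updated 7 months ago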
- Translation model demo based on a sequence-to-sequence (seq2seq) model. ☆17 Updated 6 years ago
- Large-scale Chinese corpus ☆38 Updated 5 years ago
- Word-dictionary crawler and format converter for Sogou Input Method (搜狗输入法) dictionaries ☆25 Updated 6 years ago
- NLP tools: word segmentation, sentence segmentation, new-word discovery ☆24 Updated 9 months ago
- Chinese psychology question-answering dataset ☆67 Updated 4 years ago
- Chinese text rewriting ☆19 Updated 4 years ago
- RelExt: A Tool for Relation Extraction from Text. Extracts entity relations from text. ☆48 Updated 2 years ago