CyberCommy / baidu-wiki-500w
Baidu Baike 5-million-entry dataset
☆34Updated last year
Alternatives and similar repositories for baidu-wiki-500w
Users that are interested in baidu-wiki-500w are comparing it to the libraries listed below
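- Chinese-language resources: word segmentation, vocabularies, core dictionaries, event vocabularies, stop words, sensitive words, question answering, QA data, knowledge graphs, and text corpora.☆162Updated 3 years ago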
- Baidu QA 1-million-entry dataset☆47Updated last year
- PNW, a Chinese new-word discovery algorithm that can identify new words of any length.☆15Updated last year
- Hand-written implementation of Elasticsearch's inverted index and the BM25 algorithm☆47Updated 6 years ago
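- A personally implemented language-computing experiment platform built on Django and semantic-ui. Features include general natural language processing, word computation, social hot-topic computation, person computation, literary profiling, job profiling, and other social-computing functions☆29Updated 7 years ago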
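- Publicly released JD.com/Taobao customer-service dialogue data; a dialogue system built on a seq2seq generative model that won second place☆43Updated 2 years ago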
- Crawler and format converter for Sogou Input Method (搜狗输入法) word dictionaries☆25Updated 7 years ago
- Self-implemented WeiboIndexSpyder based on Selenium: collects the Sina Weibo Index (微指数), including the overall index, the mobile index, and the PC index☆31Updated 6 years ago
- Medical corpus. Corpus of medical institution names. National drug codes (药品本位码).☆69Updated last year
- Data annotation tool developed in-house by Datawhale☆68Updated last year
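- Fine-tunes pretrained language models (BERT, Roberta, XLBert, etc.) to compute the similarity between two texts (recast as a sentence-pair classification task); suited to Chinese text☆89Updated 4 years ago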
- A Python toolkit for file processing, text cleaning, and data splitting.☆32Updated 2 years ago
- Deduplicates massive text collections using Simhash☆12Updated 6 years ago
- Domain-specific lexicon construction / Chinese new-word discovery / domain lexicon discovery☆29Updated 5 years ago
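- Named-entity recognition (NER) model: fast, efficient, and simple, with one-click Docker deployment and model invocation. Recognizes address, person-name, and organization-name entities.☆36Updated last year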
- ZhidaoChatbot, a chatbot that can act as an expert on common questions such as why, how, when, who, and what, based on the online question-answer web…☆42Updated 6 years ago
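- An exploration of Eventline (important news ranked and organized by publication time): for the set of news reports on a given event topic, uses the docrank algorithm to identify the importance of each report and uses the reports' publication times to select, along the timeline, the important…☆221Updated 6 years ago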
- Retrieval-based question answering system built on a thermal power plant knowledge QA base / QA system / dialogue system☆53Updated 4 years ago
- Sentence-Transformers Information Retrieval example on Chinese☆29Updated last year
- 🤖 Chatbot examples, custom chatbots, and chatbot corpus import/export☆125Updated 10 months ago
- Building a character relationship graph for Jin Yong's novels☆61Updated 5 years ago
- Demo of a translation model based on a sequence-to-sequence (seq2seq) model.☆17Updated 6 years ago
- Self-implemented Pinyin2Chinese demo using a Trie and a hidden Markov model (HMM): a simple demo of pinyin segmentation and pinyin-to-Chinese conversion.☆86Updated 7 years ago
- Self-implemented BaiduIndexSpyder based on Selenium, with index image decoding and number image conversion: automatic collection of the historical Baidu search index by keyword☆41Updated 6 years ago
- Intelligent multi-turn dialogue bot for e-commerce☆58Updated 6 years ago
- Universal information extraction (entity/relation/event extraction) based on the Qwen2 model☆31Updated 10 months ago
- ☆21Updated 3 years ago
- ☆37Updated 5 years ago
- Chinese text error correction☆92Updated 3 years ago
- Chinese Microblog Automatic Summary System (for Weibo)☆30Updated 6 years ago