xieyan0811 / pdfconv
A practical tool for converting Chinese PDFs to TXT
☆30 Updated 3 years ago
Alternatives and similar repositories for pdfconv:
Users who are interested in pdfconv are comparing it to the repositories listed below:
- Building a character relationship graph for Jin Yong's novels ☆62 Updated 5 years ago
- Baidu Baike crawler ☆33 Updated 5 years ago
- ZhidaoChatbot, a chatbot that can act as an expert on common questions such as why, how, when, who, and what, based on the online question-answer web… ☆42 Updated 5 years ago
- Performance evaluation of major Chinese word segmentation tools ☆155 Updated 6 years ago
- Self-implemented SpellCorrection for query correction, based on pinyin similarity and edit distance. ☆79 Updated 2 years ago
- Word similarity computation based on Tongyici Cilin ☆116 Updated 7 years ago
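- Cognitive Inference: a systematic project on cognitive reasoning, commonsense knowledge bases, commonsense reasoning, and the evaluation of commonsense reasoning. It surveys existing commonsense knowledge bases from China and abroad from two angles, knowledge base resource construction and commonsense reasoning test evaluation, and presents the author's recent work on building, applying, and theorizing about logical reasoning knowledge bases. Specifically includes… ☆122 Updated 4 years ago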
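- A write-up of a Tianchi competition entry. Extracts 18 fields from PDFs: name, date of birth, gender, phone number, highest education level, native place, city or county of household registration, political affiliation, graduating institution, employer, job description, position, project name, project responsibilities, degree, graduation date, employment period, and project period. ☆113 Updated 6 months ago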
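- CNKI paper dataset with information on 24,000+ papers. Natural language processing, information management, text classification, text summarization, keyword extraction, research hotspot analysis, data mining, data analysis ☆46 Updated 2 months ago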
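- A feature extraction tool for Chinese characters. It can extract pronunciation (initial, final, tone), glyph structure (components, radicals), four-corner code, and other features, which can also be fed into a model as tensors ☆130 Updated 4 years ago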
- RelExt: A Tool for Relation Extraction from Text. Extracts relations between entities in text. ☆48 Updated 2 years ago
- Opinion mining for e-commerce reviews ☆38 Updated 5 years ago
- ☆41 Updated 4 years ago
- Baidu Baike crawler ☆71 Updated 8 months ago
- ChineseAntiword, an antonym lookup interface for Chinese words, based on a dictionary crawled from online resources ☆59 Updated 6 years ago
- Sequential Event Experiment based on travel notes crawled from XieCheng (Ctrip): collection of 500,000 Ctrip travel notes and construction of a sequential event graph. ☆181 Updated 6 years ago
- Evaluation of Chinese word segmentation tools ☆61 Updated 2 years ago
- Chinese multi-hop question answering dataset ☆73 Updated 6 years ago
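- Retrospective on the 2020 "Wanchuang Cup" Traditional Chinese Medicine Tianchi Big Data Competition: entity recognition challenge for Chinese medicine instruction leaflets ☆31 Updated 4 years ago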
- Sentence-Transformers Information Retrieval example on Chinese ☆29 Updated last year
- Reading-comprehension question answering with BERT on the Baidu WebQA Chinese question answering dataset ☆65 Updated 4 years ago
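- EventKGNELL, an event knowledge graph never-ending learning system and an event-centric knowledge base search system: a real-time lifelong learning system for an event-logic knowledge base, and an event-centered knowledge base search system… ☆71 Updated 4 years ago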
- Code for a Chinese error detection module, using n-gram and Bi-LSTM ☆131 Updated 5 years ago
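- Tookit-Sihui, a toolkit of common algorithms: AI mixed-text scientific calculator (calculator-sihui), sentence term frequency-inverse document frequency (TF-IDF), BM25 search, trie-based keyword search (trietree), template matching with recursive functions (fu… ☆24 Updated 3 years ago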
- A hand-written implementation of Elasticsearch's inverted index and the BM25 algorithm ☆46 Updated 6 years ago
- Sharing the third-place solution for the 2020 BAAI-JD Multimodal Dialogue Challenge (JDDC2020) ☆41 Updated 4 years ago
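- ChineseHumorSentiment, Chinese humor sentiment mining, including corpus construction and NLP mining methods. A humor sentiment computation project for Chinese text, covering construction of a humorous text corpus and humor computation models, including… ☆116 Updated 6 years ago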
- Two approaches to NLP text augmentation: synonym replacement (using a word2vec vocabulary) and back-translation ☆73 Updated 3 years ago
- Crawler and format converter for Sogou Input Method word dictionaries ☆25 Updated 6 years ago
- Keyword extraction project ☆24 Updated 4 years ago