wjn1996 / scrapy_for_zh_wikiLinks
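Crawls Chinese Wikipedia with a Scrapy-based hierarchical priority queue method and automatically extracts structured and semi-structured data
☆152 Updated 2 years ago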
Alternatives and similar repositories for scrapy_for_zh_wiki
Users that are interested in scrapy_for_zh_wiki are comparing it to the libraries listed below
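- All NLP you Need Here. Currently includes PyTorch implementations of 15 NLP demos (much of the code borrows from other open-source projects; it started as a personal project and was later open-sourced) ☆280 Updated last week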
- KgCLUE: large-scale open-source Chinese knowledge graph question answering ☆445 Updated 3 years ago
- ☆39 Updated 2 years ago
- Chinese named entity recognition ☆46 Updated 3 years ago
- A PyTorch implementation of a BiLSTM / BERT / RoBERTa (+ BiLSTM + CRF) model for Chinese Word Segmentation. ☆211 Updated 2 years ago
- A PyTorch implementation of a BiLSTM / BERT / RoBERTa (+ CRF) model for Named Entity Recognition. ☆503 Updated 4 years ago
- Implementation of an NER model on a Chinese dataset. ☆73 Updated 2 years ago
- Chinese multi-label classification based on pytorch_bert ☆90 Updated 3 years ago
- Chinese named entity recognition (NER) using several methods; the code includes detailed comments ☆50 Updated 2 years ago
- A record of classic NER models; the repository currently includes code for the following models: BERT, LSTM, GlobalPointer, CRF, HMM ☆34 Updated 2 years ago
- Chinese text classification based on PyTorch + BERT ☆85 Updated 2 years ago
- Chinese-Text-Classification project including bert-classification, textCNN and so on. ☆159 Updated 2 years ago
- Multi-label text classification based on PyTorch + BERT ☆106 Updated last year
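- A survey of the information extraction field by the natural language processing research team of the Big Data Advanced Innovation Center at Beihang University. Covers sub-tasks including entity recognition, relation extraction, and attribute extraction, each surveyed across both academia and industry. ☆469 Updated 3 years ago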
- Information extraction with pointer networks, covering named entity recognition, relation extraction, and event extraction. ☆126 Updated 2 years ago
- All about Chinese NER ☆318 Updated last year
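- Very long text classification (over 1,000 characters); document-level / discourse-level text classification; mainly addresses the long-range dependency problem ☆130 Updated 3 years ago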
- ☆106 Updated last year
- Using BERT+Bi-LSTM+CRF ☆138 Updated 3 years ago
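- A tutorial on and implementation of a disease-centered medical knowledge graph and a QA system based on it. Knowledge graph construction, automatic question answering, KG-based automatic question answering. A disease-centered medical-domain knowledge graph of a certain scale… ☆70 Updated 6 years ago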
- Chinese information extraction, including entity extraction, relation extraction, and event extraction ☆247 Updated last year
- Triple extraction with a PyTorch-based GlobalPointer. ☆80 Updated 2 years ago
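- Chinese named entity recognition. Includes the latest papers on Chinese NER, related tools, and datasets, as well as Chinese pre-trained models, word vectors, NER surveys, and more. ☆716 Updated last week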
- PyTorch implementation of Global Pointer for unified handling of nested and non-nested NER ☆394 Updated 2 years ago
- CMeIE/CBLUE/CHIP/entity-relation extraction/SPO extraction ☆231 Updated 3 years ago
- SimCSE contrastive learning model for Chinese semantic similarity ☆86 Updated 3 years ago
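- A Chinese text classification and sequence labeling toolkit (PyTorch); supports multi-class and multi-label classification of long and short Chinese texts, and sequence labeling tasks such as Chinese named entity recognition, part-of-speech tagging, word segmentation, and extractive text summarization. ☆346 Updated 11 months ago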
- [COLING 2022] CSL: A Large-scale Chinese Scientific Literature Dataset ☆636 Updated 2 years ago
- Unified Structure Generation for Universal Information Extraction ☆932 Updated 2 years ago
- Two approaches to NLP text augmentation: synonym replacement (using a word2vec vocabulary) and back-translation ☆76 Updated 4 years ago