fondoger / scholar_dataset
Open-source dataset of Baidu Baike scholar entries, CNKI scholars, and Chinese paper metadata
☆17 Updated 4 years ago
Alternatives and similar repositories for scholar_dataset:
Users who are interested in scholar_dataset are comparing it to the repositories listed below
- Chinese subjectivity detection based on a subjectivity knowledge base: a method for rating sentence subjectivity using a Chinese subjectivity knowledge base. ☆57 Updated last year
- Reproduces the code for the paper "Short-Text Keyword Extraction and Expansion Based on Topic Models" (《基于主题模型的短文本关键词抽取及扩展》) ☆30 Updated 4 years ago
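- Proposes a partition-based LDA topic model (PLDA) that improves on traditional LDA. Medium and long documents tend to have a clear discourse structure, but traditional LDA cannot identify the topic of each section when processing them. PLDA splits medium and long documents, such as news reports and State Council work reports, by paragraph, models the parts separately and then merges them, and compares the results with traditional LDA… ☆38 Updated 5 years ago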
- Generates sentence or word vectors with a pretrained BERT model ☆28 Updated 4 years ago
- Fine-tunes a pretrained BERT model to compute text similarity ☆100 Updated last year
- ☆81 Updated 6 years ago
- Computes text similarity using TF-IDF and cosine similarity ☆36 Updated 6 years ago
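- Uses the open-source Bert-as-Service with a pretrained model to generate document feature vectors, clusters COVID-19 literature with k-means, visualizes the data with t-SNE, generates topic keywords for each cluster with LDA, and draws Bokeh plots that support searching and filtering the data by cluster and keyword. ☆19 Updated 4 years ago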
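- Domain-adaptive text mining tool (new word discovery, sentiment analysis, entity linking, etc.) based on a small set of seed words and background knowledge ☆13 Updated 5 years ago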
- Script tool for extracting SAO structures from English text ☆10 Updated 9 years ago
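- Quickly builds domain-specific sentiment lexicons (mobile phones, cars, etc.) using the SO_PMI mutual-information algorithm and a word-vector method ☆89 Updated 3 years ago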
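- Chinese text keyword extraction in Python, using five methods in two categories: TF-IDF, LDA, RNN, LSTM, and LR-SGD. Billed by the author as the most complete collection online. ☆33 Updated 4 years ago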
- Text similarity computation with TF-IDF + Word2vec, best suited to long texts ☆24 Updated 5 years ago
- Multi-label text classification ☆53 Updated 5 years ago
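- Automatic summarization of news text. Builds on TextRank and adds title, sentence-position, key-entity, and cue-word features to compute a combined sentence weight, then applies the MMR algorithm to balance topical relevance and diversity in the summary. ☆25 Updated 2 years ago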
- PyTorch implementation of multi-label text classification, including various models and pretrained models, with a focus on Chinese preprocessing. ☆75 Updated 5 years ago
- Builds a sentiment lexicon with SO-PMI from positive and negative seed words ☆25 Updated 9 years ago
- Dataset from 'Character-based BiLSTM-CRF Incorporating POS and Dictionaries for Chinese Opinion Target Extraction' ☆40 Updated 6 years ago
- Text summary generation ☆13 Updated 2 years ago
- Graduation project: an event reasoning system based on an event logic graph (事理图谱) ☆67 Updated 4 years ago
- Self-complementing sentiment word expansion using seed sentiment words and SO-PMI; the method has been tested and found effective. Based on sentiment seed words and SO-PMI… ☆86 Updated 6 years ago
- Obtains Chinese character and word vectors with BERT ☆10 Updated 3 years ago
- Continued pretraining of Chinese BERT ☆30 Updated 3 years ago
- Text similarity ☆23 Updated 5 years ago
- Opinion mining on e-commerce reviews ☆38 Updated 5 years ago
- SinglepassTextCluster, a text clustering tool based on the single-pass clustering algorithm that uses TF-IDF vectors and doc2vec, which can be used for… ☆62 Updated 3 years ago
- Text clustering ☆34 Updated 3 years ago
- Dalian University of Technology sentiment vocabulary ontology and related operations ☆129 Updated 7 years ago
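- Unsupervised opinion clustering. Extracts opinions via dependency relations, computes similarity between opinions, and clusters the extracted opinions ☆47 Updated 6 years ago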
- multi-label, classifier, text classification, BERT, ALBERT, multi-label-classification, seq2seq, attention, beam search ☆32 Updated 2 years ago