hiyoung123 / DuplicateRemove
A text deduplication algorithm based on simhash
☆19 Updated 3 years ago
Related projects
Alternatives and complementary repositories for DuplicateRemove
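- Crime amount extraction for the 法研杯 (CAIL) competition ☆12 Updated 2 years ago
- bert_avg, bert_whitening, sbert, consert, simcse, esimcse: Chinese sentence vector representations ☆16 Updated 2 years ago
- Experimenting with a Torch implementation of 苏神's (Su Jianlin's) CoSENT ☆32 Updated 2 years ago
- Regex-based extraction and normalization of time keywords ☆21 Updated 2 years ago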
- ☆57 Updated last year
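- Chinese text correction based on seq2edit (Gector). ☆26 Updated 2 years ago
- Chinese bigbird pre-trained models ☆89 Updated 2 years ago
- NLP experiment: new word discovery + continued pre-training of a pre-trained model ☆47 Updated last year
- A Chinese text classification scaffold implemented in Pytorch, with a comparison of commonly used models. ☆18 Updated 3 years ago
- RelExt: A Tool for Relation Extraction from Text. A tool for extracting entity relations from text. ☆48 Updated 2 years ago
- bert4keras documentation (unofficial) ☆23 Updated 2 years ago
- Medical named entity recognition using the extractive UIE open-sourced by PaddleNLP (torch implementation) ☆42 Updated 2 years ago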
- SinglepassTextCluster, a TextCluster tool based on the Singlepass clustering algorithm that uses tfidf vectors and doc2vec, which can be used for… ☆63 Updated 3 years ago
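- 格物 (Gewu): multilingual and Chinese large-scale pre-trained models, lightweight edition, covering pure Chinese, knowledge-enhanced, and 113-language multilingual variants; built on the mainstream Roberta architecture, suitable for NLU and NLG tasks, and supporting frameworks such as pytorch, tensorflow, uer, and huggingface. Multilingual and Chinese … ☆26 Updated 2 years ago
- A tool for time extraction, parsing, and normalization ☆49 Updated 2 years ago
- Sentence matching models, including the unsupervised SimCSE, ESimCSE, and PromptBERT, and the supervised SBERT and CoSENT. ☆97 Updated 2 years ago
- A Chinese keyword extraction method based on pre-trained models (Chinese-version code for the paper SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-trained Language Model) ☆12 Updated 4 years ago
- DescriptionPairsExtraction, a program for extracting entities and their description pairs, based on Albert and data back-annotation. Based on Albert and structured-data back-annotation… ☆20 Updated 2 years ago
- 千言 (Qianyan) multi-skill dialogue, including chitchat, knowledge-grounded dialogue, and recommendation dialogue ☆27 Updated 3 years ago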
- ☆100 Updated 4 years ago
- A concise implementation of SimCSE ☆17 Updated 3 years ago
- Loading CDial-GPT with bert4keras ☆38 Updated 4 years ago
- A simple implementation of Biaffine NER. ☆35 Updated 2 years ago
- ☆87 Updated 3 years ago
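- Chinese sentence vector generation based on simcse ☆15 Updated 2 years ago
- For a general entity, relation, and event extraction task, the UIE model framework was needed and had to be deployed on an Ascend 310 (昇腾310) server. Because the UIE model is built on ernie3.0, and paddle does not yet officially support deploying ernie3.0 models on the Ascend 310, the following procedure was devised. The main process is to first train the model with paddle… ☆17 Updated 2 years ago
- Named entity recognition with Baidu UIE, based on pytorch. ☆54 Updated last year
- A pytorch version of Chinese multi-turn Cdial based on gpt+nezha ☆11 Updated 2 years ago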
- Chinese Grammatical Error Diagnosis ☆11 Updated 3 years ago