hiyoung123 / DuplicateRemove
A simhash-based text deduplication algorithm
☆20 Updated 4 years ago
Alternatives and similar repositories for DuplicateRemove
Users that are interested in DuplicateRemove are comparing it to the libraries listed below
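- SinglepassTextCluster, a text clustering tool based on the single-pass clustering algorithm that uses TF-IDF vectors and doc2vec, which can be used for… ☆63 Updated 3 years ago
- Chinese text correction based on seq2edit (Gector). ☆29 Updated 2 years ago
- Sentence matching models, including the unsupervised SimCSE, ESimCSE, and PromptBERT, and the supervised SBERT and CoSENT. ☆99 Updated 2 years ago
- Chinese BigBird pre-trained model ☆93 Updated 3 years ago
- NLP experiments: new word discovery + continued pre-training of pre-trained models ☆47 Updated last year
- Baseline for the Intelligent Text Proofreading Competition (Chinese Text Correction) ☆67 Updated 2 years ago
- RelExt: A Tool for Relation Extraction from Text. An entity relation extraction tool for text. ☆49 Updated 3 years ago
- Reading-comprehension question answering with BERT on the Baidu WebQA Chinese question answering dataset ☆65 Updated 5 years ago
- Medical named entity recognition based on the extractive UIE open-sourced in PaddleNLP (torch implementation) ☆43 Updated 2 years ago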
- ☆87 Updated 3 years ago
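- State-of-the-art solution and online demo for CTC2021, the Chinese text correction competition ☆72 Updated 2 years ago
- A deep learning natural language processing framework based on Tensorflow, designed in the style of Scikit-Learn. Supports more than 40 model classes, covering language models, text classification, NER, MRC, knowledge distillation, and other areas ☆115 Updated 2 years ago
- A simple, easy-to-use Python module for manipulating dates and times through strings. Regex-based time extraction, string time parsing, string time extraction. Chinese time extraction, extracting times from a single sentence ☆75 Updated last year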
- This is the official code for paper titled "Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models". ☆68 Updated 4 years ago
- ☆128 Updated 2 years ago
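- bert_avg, bert_whitening, sbert, consert, simcse, esimcse: Chinese sentence vector representations ☆16 Updated 3 years ago
- Chinese text correction model, implemented in Keras ☆74 Updated 4 years ago
- A PyTorch-based scaffold for Chinese text classification, with a comparison of common models. ☆18 Updated 4 years ago
- An experimental Torch implementation of Su Shen's CoSENT ☆32 Updated 3 years ago
- Chinese UniLM pre-trained model ☆83 Updated 4 years ago
- Regex-based extraction and normalization of time keywords ☆21 Updated 3 years ago
- Introductory article on dialogue rewriting ☆97 Updated 2 years ago
- lasertagger-chinese; a Chinese learning example for lasertagger, with example data, annotations, and shell run scripts ☆75 Updated 2 years ago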
- Chinese version of Longformer ☆113 Updated 4 years ago
- PyTorch Efficient GlobalPointer ☆56 Updated 3 years ago
- Using LEAR for NER extraction ☆29 Updated 3 years ago
- Code & Data for our Paper "NaSGEC: Multi-Domain Chinese Grammatical Error Correction for Native Speaker Texts" (ACL 2023 Findings) ☆90 Updated 5 months ago
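- Torch baseline for machine reading comprehension in Baidu's 2021 Language and Intelligence Technology Competition ☆53 Updated 4 years ago
- Chinese error correction ☆92 Updated 3 years ago
- A tool for time extraction, parsing, and normalization ☆53 Updated 2 years ago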