Chinese text paraphrasing built on a modified LaserTagger model, used for data augmentation of Chinese text.
☆322 · Jan 3, 2024 · Updated 2 years ago
Alternatives and similar repositories for text_data_enhancement_with_LaserTagger
Users who are interested in text_data_enhancement_with_LaserTagger are comparing it to the repositories listed below.
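- Chinese text rewriting built on a modified LaserTagger model, named "文本手术刀" ("Text Scalpel") by its author. The project currently implements a text paraphrasing task for augmenting NLP corpora. ☆215 · Mar 24, 2023 · Updated 3 years ago
- lasertagger-chinese: a Chinese-language LaserTagger learning example, with sample data, annotations, and shell scripts for running it. ☆76 · Mar 25, 2023 · Updated 3 years ago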
- ☆605 · Mar 12, 2026 · Updated last month
- One-click Chinese data augmentation package covering NLP data augmentation, BERT data augmentation, and EDA; install with pip install nlpcda. ☆1,881 · Mar 18, 2025 · Updated last year
- Research on the Construction and Application of Paraphrase Parallel Corpus ☆11 · Oct 26, 2020 · Updated 5 years ago
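- An implementation of the EDA paper for Chinese corpora: an EDA data augmentation tool for Chinese text. NLP data augmentation. Paper reading notes. ☆1,384 · May 31, 2022 · Updated 3 years ago
- Chinese text error correction based on BERT. ☆242 · Jun 12, 2023 · Updated 2 years ago
- PMI is a special case of mutual information (NMI). Mutual information is a concept from information theory, used mainly to measure the degree of association between two signals. In text processing, PMI is used to compute the degree of association between two words. Compared with traditional similarity measures, the advantage of PMI is that it examines word co-occurrence statistically to determine whether words are semantically related… ☆15 · Aug 24, 2020 · Updated 5 years ago
- A collection of high-quality Chinese pre-trained models: state-of-the-art large models, fastest small models, and specialized similarity models. ☆816 · Jul 8, 2020 · Updated 5 years ago
- Keyphrase or Keyword Extraction: a Chinese keyword extraction method based on pre-trained models (paper: SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-trained La… ☆432 · May 17, 2020 · Updated 5 years ago
- First-place online-leaderboard solution for the Tianchi competition on identifying similar sentence pairs about the epidemic. ☆435 · Oct 17, 2020 · Updated 5 years ago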
- Open Language Pre-trained Model Zoo ☆1,006 · Nov 18, 2021 · Updated 4 years ago
- Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo ☆3,104 · May 9, 2024 · Updated last year
- A PyTorch-based knowledge distillation toolkit for natural language processing ☆1,696 · May 8, 2023 · Updated 2 years ago
- Pre-trained Chinese ELECTRA models ☆1,439 · Jul 15, 2025 · Updated 9 months ago
- Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus, and leaderboard ☆1,786 · Feb 18, 2023 · Updated 3 years ago
- Pre-Training with Whole Word Masking for Chinese BERT (the Chinese BERT-wwm model series) ☆10,197 · Jul 15, 2025 · Updated 9 months ago
- A BERT for retrieval and generation ☆860 · Feb 26, 2021 · Updated 5 years ago
- Chineses-PPDB ☆14 · Nov 23, 2020 · Updated 5 years ago
- A Lite BERT for Self-Supervised Learning of Language Representations: large-scale pre-trained Chinese ALBERT models ☆3,984 · Nov 21, 2022 · Updated 3 years ago
- Pre-trained Chinese XLNet models ☆1,649 · Jul 15, 2025 · Updated 9 months ago
- Pre-trained Chinese RoBERTa models: RoBERTa for Chinese ☆2,783 · Jul 22, 2024 · Updated last year
- Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab. ☆3,157 · Jan 22, 2024 · Updated 2 years ago
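- ChineseSemanticKB, a Chinese semantic knowledge base: 12 categories of million-scale common semantic dictionaries for Chinese language processing, including a 340,000-entry abstract semantics library, a 340,000-entry antonym library, and a 430,000-entry synonym library. It supports applications such as sentence expansion, rewriting, and event abstraction and generalization. ☆780 · Mar 17, 2023 · Updated 3 years ago
- A review of the top solutions from all NLP competitions; focused solely on NLP competitions and continuously updated! ☆2,800 · Apr 4, 2026 · Updated last week
- A reproduction of the ACL 2020 FastBERT paper; paper URL: //arxiv.org/pdf/2004.02178.pdf ☆192 · Dec 15, 2021 · Updated 4 years ago
- A PyTorch-based scaffold for Chinese text classification, with a comparison of commonly used models. ☆18 · Apr 23, 2021 · Updated 4 years ago
- fastHan is a Chinese natural language processing toolkit built on fastNLP and PyTorch, as convenient to call as spaCy. ☆762 · Dec 9, 2023 · Updated 2 years ago
- Data Augmentation for NLP. ☆294 · Dec 10, 2020 · Updated 5 years ago
- PyTorch-based Chinese semantic similarity matching models (ABCNN, Albert, Bert, BIMPM, DecomposableAttention, DistilBert, ESIM, RE2, Roberta, SiaGRU, XlNet) ☆798 · Mar 22, 2020 · Updated 6 years ago
- Chinese BERT that uses words as the basic unit ☆477 · Nov 18, 2021 · Updated 4 years ago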
- ☆90 · Jun 20, 2020 · Updated 5 years ago
- Data augmentation for NLP, presented at EMNLP 2019 ☆1,652 · Mar 19, 2023 · Updated 3 years ago
- DeepIE: Deep Learning for Information Extraction ☆1,941 · Dec 9, 2022 · Updated 3 years ago
- Large Scale Chinese Corpus for NLP ☆9,885 · Feb 6, 2026 · Updated 2 months ago
- pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and Qwen2.5 to error correction and works out of the box. ☆6,427 · Jan 12, 2026 · Updated 3 months ago
- Keras implementation of transformers for humans ☆5,420 · Nov 11, 2024 · Updated last year
- Natural language processing (NLP): Xiaojiang robot (a retrieval-based chitchat chatbot), BERT sentence vectors and similarity (Sentence Similarity), XLNet sentence vectors and similarity (text xlnet embedding), text classification (Text classification), entity extraction (ner,b… ☆1,538 · Sep 23, 2021 · Updated 4 years ago
- A simple experiment with Pattern-Exploiting Training on Chinese ☆173 · Oct 10, 2020 · Updated 5 years ago