sharejing / Takin
A Python toolkit for file processing, text cleaning, and data splitting.
☆35 Updated 3 years ago
Alternatives and similar repositories for Takin
Users that are interested in Takin are comparing it to the libraries listed below
- Code & Data for our Paper "NaSGEC: Multi-Domain Chinese Grammatical Error Correction for Native Speaker Texts" (ACL 2023 Findings) ☆96 Updated 11 months ago
- Chinese machine reading comprehension dataset ☆109 Updated 4 years ago
- Major text summarization models: runnable solutions for Chinese text ☆69 Updated 2 years ago
- LERT: A Linguistically-motivated Pre-trained Language Model ☆220 Updated 6 months ago
- PERT: Pre-training BERT with Permuted Language Model ☆367 Updated 6 months ago
- The Corpus & Code for EMNLP 2022 paper "FCGEC: Fine-Grained Corpus for Chinese Grammatical Error Correction" | FCGEC Chinese grammatical error correction corpus and STG model ☆120 Updated last year
- OpenTextClassification is all you need for text classification! Open text classification for everyone, enjoy your NLP journey! This may be the most comprehensive so far… ☆209 Updated last year
- ChineseTextualInference project, including Chinese corpus construction and an inference model: translation and construction of an 880,000-entry Chinese textual entailment dataset, and a deep-learning-based textual entailment classification model… ☆176 Updated 7 years ago
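- Automatic text summarization ☆94 Updated 2 years ago
- Unilm for Chinese Chitchat Robot. A compliment-style ("kuakua") chitchat bot project based on the Unilm model. ☆158 Updated 5 years ago
- Baseline for the Intelligent Text Proofreading Competition (Chinese Text Correction) ☆67 Updated 3 years ago
- Continued pre-training of Chinese BERT ☆31 Updated 4 years ago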
- Benchmark of KgCLUE, with different models and methods ☆28 Updated 4 years ago
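- 🌈 NERpy: Implementation of Named Entity Recognition using Python. A named entity recognition tool supporting models such as BertSoftmax and BertSpan, ready to use out of the box. ☆116 Updated last year
- Applying BERT fine-tuning to machine translation; wow, the results are really good. ☆51 Updated 4 years ago
- Modify Chinese text, based on the LaserTagger model. Text paraphrasing: Chinese text data augmentation built on LaserTagger. ☆322 Updated 2 years ago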
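- A pre-training-based sentence embedding generation tool ☆138 Updated 2 years ago
- Chinese, word segmentation, vocabularies, core dictionary, event vocabulary, stop words, sensitive words, question answering, QA data, knowledge graph, text corpora. ☆171 Updated 4 years ago
- CCL 2022 evaluation on text correction for Chinese language learners ☆142 Updated 3 years ago
- A retrieval-based dialogue system solution built on vector recall, dense retrieval, FAQ… ☆34 Updated 4 years ago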
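- Chinese natural language inference dataset (A large-scale Chinese Natural Language Inference and Semantic Similarity Calculation Dataset) ☆435 Updated 5 years ago
- Chinese text error correction based on BERT ☆239 Updated 2 years ago
- End-to-end long-text summarization model (CAIL 2020 (法研杯) judicial summarization track) ☆398 Updated last year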
- A framework for cleaning Chinese dialog data ☆274 Updated 4 years ago
- ☆420 Updated last year
- Chinese text summarization toolkit; extractive Chinese text summarization with Lead3, keyword, textrank, text teaser, word significance, LDA, LSI, NMF. (gra… ☆419 Updated last year
- Chinese language model pre-training in PyTorch ☆386 Updated 5 years ago
- CINO: Pre-trained Language Models for Chinese Minority (pre-trained models for ethnic minority languages) ☆258 Updated 6 months ago
- This is the dataset for Chinese community medical question answering. ☆111 Updated 6 years ago
- Named entity recognition with Baidu UIE, based on PyTorch. ☆57 Updated 2 years ago