Lafitte1573 / NLCorpora
A collection of high-quality Chinese datasets for NLP
☆52Updated 8 months ago
Alternatives and similar repositories for NLCorpora
Users who are interested in NLCorpora are comparing it to the libraries listed below
- A simple implementation of text classification using Qwen2ForSequenceClassification.☆89Updated last year
- CINO: Pre-trained Language Models for Chinese Minority Languages☆258Updated 5 months ago
- ☆119Updated last year
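- CPED: A Large-Scale Chinese Personalized and Emotional Dialogue Dataset for Conversational AI☆263Updated 3 years ago
- Study notes and materials for natural language processing (NLP) interview preparation, compiled by the authors from their own interviews and experience; it currently contains a growing set of interview questions across the subfields of NLP.☆87Updated 4 years ago
- This project completes all tasks of NLP-Beginner, a set of introductory NLP exercises (text classification, information extraction, knowledge graphs, machine translation, question answering, text generation, Text-to-SQL, text correction, text mining, knowledge distillation, model acceleration, OCR, TTS, Prompt, embedding, etc.); all code has been tested…☆215Updated 2 years ago
- A roundup of currently available open-source Chinese dialogue datasets☆194Updated 2 years ago
- Chinese text correction based on the T5 model☆33Updated last year
- A Chinese mental health support question-answering dataset with rich annotations of support strategies. It can be used to generate long counseling texts that make use of those strategies.☆240Updated last year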
- Python ROUGE Score Implementation for Chinese Language Task (official rouge score)☆111Updated last year
- 活字 (Huozi) general-purpose large language model☆391Updated last year
- text correction papers☆313Updated last year
- The official repository of the paper: COLD: A Benchmark for Chinese Offensive Language Detection☆304Updated 2 years ago
- Open-source release of the MuCGEC Chinese error correction dataset and SOTA text correction models; Code & Data for our NAACL 2022 Paper "MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Gr…☆560Updated 2 years ago
- Code & Data for our Paper "NaSGEC: Multi-Domain Chinese Grammatical Error Correction for Native Speaker Texts" (ACL 2023 Findings)☆96Updated 10 months ago
- Official github repo for ACLUE, an evaluation benchmark focused on ancient Chinese language comprehension☆32Updated last year
- A Chinese medical ChatGPT based on LLaMa, training from large-scale pretrain corpus and multi-turn dialogue dataset.☆386Updated 2 years ago
- Fine-tuning large language models with the DPO algorithm; simple and easy to get started with.☆48Updated last year
- ☆83Updated last year
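- YAYI (雅意) information extraction large language model: instruction-tuned on millions of manually constructed, high-quality information extraction examples, developed by the 中科闻歌 algorithm team. (Repo for YAYI Unified Information Extraction Model)☆314Updated last year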
- PromptCBLUE: a large-scale instruction-tuning dataset for multi-task and few-shot learning in the medical domain in Chinese☆386Updated last year
- LAiW: A Chinese Legal Large Language Models Benchmark☆85Updated last year
- Archive for AINLP History Article☆198Updated 4 years ago
- Alibaba Tongyi Qianwen (Qwen-7B-Chat/Qwen-7B): fine-tuning, LoRA, and inference☆132Updated last year
- LLM for NER☆80Updated last year
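- An introductory NLP course translated by the MLNLP community.☆178Updated 2 years ago
- A Chinese translation of the third edition of Speech and Language Processing (《自然语言处理综论》).☆125Updated 2 years ago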
- Source code for the paper "C-LLM: Learn to Check Chinese Spelling Errors Character by Character"☆29Updated last year
- Instruction fine-tuning for Baichuan-13B☆90Updated 2 years ago
- A Massive Multi-Level Multi-Subject Knowledge Evaluation benchmark☆103Updated 2 years ago