selfcs / stop-and-sensitive-words
Stop word and sensitive word lists
☆17 Updated 5 years ago
Alternatives and similar repositories for stop-and-sensitive-words
Users who are interested in stop-and-sensitive-words are comparing it to the repositories listed below
- Parameter-efficient fine-tuning of ChatGLM-6B based on LoRA and P-Tuning v2 ☆55 Updated 2 years ago
- Automatically tracks trending GitHub repos, updated daily ☆50 Updated last week
- aigc evals ☆10 Updated 2 years ago
- Baidu QA dataset with 1 million entries ☆46 Updated 2 years ago
- Baidu Baike crawler ☆33 Updated 6 years ago
- A large, high-quality corpus of Chinese synonyms. ☆69 Updated 4 years ago
- ☆23 Updated 2 years ago
- moss chat finetuning ☆51 Updated last year
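- Making NLP something everyone can do; open source isn't easy, so remember to star ☆101 Updated 2 years ago
- Chinese, word segmentation, vocabularies, core dictionaries, event vocabularies, stop words, sensitive words, question answering, QA data, knowledge graphs, text corpora. ☆171 Updated 4 years ago
- RelExt: A tool for entity relation extraction from text. ☆51 Updated 3 years ago
- Large-scale Chinese corpus ☆44 Updated 6 years ago
- Ziya-LLaMA-13B is IDEA's large-scale pretrained model with 13 billion parameters, based on LLaMa, capable of translation, programming, text classification, information extraction, summarization, copywriting, commonsense question answering, and mathematical calculation. The Ziya general-purpose large model has completed a three-stage training process: large-scale pretraining, multi-task supervised fine-tuning, and learning from human feedback. This document is mainly intended for Ziya-… ☆46 Updated 2 years ago
- Baseline for the Intelligent Text Proofreading Competition (Chinese Text Correction) ☆66 Updated 3 years ago
- Uses an LLM plus a sensitive word list to automatically determine whether text involves sensitive words. ☆136 Updated 2 years ago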
- The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF. ☆68 Updated 2 years ago
- An Open-Source Package for Chinese Open-domain Conversational Chatbot (Chinese chit-chat dialogue system; one-click deployment of a WeChat chit-chat bot) ☆108 Updated 2 years ago
- Code & Data for our Paper "NaSGEC: Multi-Domain Chinese Grammatical Error Correction for Native Speaker Texts" (ACL 2023 Findings) ☆96 Updated 11 months ago
- TensorRT ☆11 Updated 5 years ago
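- PNW, a Chinese new-word discovery algorithm that can identify new words of any length. ☆16 Updated 2 years ago
- CCL 2023 evaluation on text error correction for Chinese language learners ☆30 Updated 2 years ago
- NLU & NLG (zero-shot) based on the mengzi-t5-base-mt pretrained model ☆76 Updated 3 years ago
- Code implementation of Baichuan Dynamic NTK-ALiBi: inference on longer texts without fine-tuning ☆49 Updated 2 years ago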
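- Implements a cross-model technical scheme that combines integrated switching of multiple LoRA weights with Zero-Finetune (zero fine-tuning) enhancement: LLM-Base + LLM-X + Alpaca. Initially, LLM-Base is the Chatglm6B base model and LLM-X is a LLAMA enhancement model. The scheme is simple and efficient; the goal is to let such language models be deployed widely with low energy consumption, and… ☆116 Updated 2 years ago
- Chinese instruction dataset for fine-tuning LLMs ☆28 Updated 2 years ago
- CNKI (China National Knowledge Infrastructure) paper dataset with information on 24,000+ papers. Natural language processing, information management, text classification, text summarization, keyword extraction, research hotspot analysis, data mining, data analysis ☆53 Updated 11 months ago
- SuperCLUE Langya Leaderboard: an anonymous head-to-head evaluation benchmark for general-purpose Chinese large models ☆145 Updated last year
- GOAT (山羊, "goat") is a Chinese-English large language model, built with SFT on LlaMa. ☆12 Updated 2 years ago
- Survey of large language model training and serving ☆37 Updated 2 years ago
- (NBCE) Naive Bayes-based Context Extension on ChatGLM-6b ☆15 Updated 2 years ago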