selfcs / stop-and-sensitive-words
Stop-word and sensitive-word lexicon
☆16 Updated 4 years ago
Alternatives and similar repositories for stop-and-sensitive-words:
Users who are interested in stop-and-sensitive-words compare it to the repositories listed below:
- GOAT (山羊) is a Chinese-English large language model, built on LLaMA with SFT. ☆12 Updated last year
- ☆23 Updated last year
- Parameter-efficient fine-tuning of ChatGLM-6B based on LoRA and P-Tuning v2 ☆55 Updated last year
- Construction of the GoGPT Chinese instruction dataset ☆10 Updated last year
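- CNKI (China National Knowledge Infrastructure) paper dataset with information on 24,000+ papers. Natural language processing, information management, text classification, text summarization, keyword extraction, research hotspot analysis, data mining, data analysis ☆48 Updated 3 weeks ago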
- Reading notes on top-conference papers relevant to NLP algorithm engineers [text matching volume] ☆12 Updated 2 years ago
- A survey of large language model training and serving ☆37 Updated last year
- aigc evals ☆10 Updated last year
- Sensitive-word filter list for public-security website filing (公安网备) ☆13 Updated 6 years ago
- Large-scale Chinese corpus ☆40 Updated 5 years ago
- RelExt: A Tool for Relation Extraction from Text. A tool for extracting entity relations from text. ☆50 Updated 2 years ago
- ☆20 Updated 3 years ago
- (NBCE) Naive Bayes-based Context Extension on ChatGLM-6b ☆14 Updated last year
- General-purpose information extraction based on the Qwen2 model [entity/relation/event extraction] ☆30 Updated 8 months ago
- A simhash-based text deduplication algorithm ☆20 Updated 3 years ago
- Tracks trending GitHub repos automatically, updated daily ☆47 Updated this week
- Uses an LLM plus a sensitive-word lexicon to automatically determine whether text involves sensitive words. ☆112 Updated last year
- CCL 2023 evaluation on text error correction for learners of Chinese ☆28 Updated last year
- NLP (natural language processing) tutorial https://dataxujing.github.io/NLP-paper/ ☆32 Updated 3 years ago
- ☆34 Updated 3 years ago
- TensorRT ☆11 Updated 4 years ago
- A PyTorch-based scaffold for Chinese text classification, with a comparison of commonly used models. ☆18 Updated 3 years ago
- Text deduplication ☆69 Updated 10 months ago
- Baselines for CCKS 2022 Task "Product Knowledge Graph Alignment" ☆29 Updated 2 years ago
- A Python toolkit for file processing, text cleaning and data splitting. ☆32 Updated 2 years ago
- Chinese instruction datasets for fine-tuning LLMs ☆27 Updated last year
- PNW, a Chinese new-word discovery algorithm that can identify new words of any length. ☆15 Updated last year
- moss chat finetuning ☆50 Updated 11 months ago
- The newest version of llama3, with source code explained line by line in Chinese ☆22 Updated 11 months ago
- SuperCLUE-Role: a native-Chinese role-playing evaluation benchmark ☆30 Updated 11 months ago