CharyHong / Stopwords

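Common Chinese stopword lists: includes the Baidu stopword list, the Harbin Institute of Technology (HIT) stopword list, and the Sichuan University Machine Intelligence Laboratory stopword list. Also includes curated English stopword lists and stopword lists for other languages.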
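As a rough illustration of how several stopword lists like these might be combined and applied, here is a minimal Python sketch. It assumes each list is plain text with one stopword per line, which is an assumption about the file layout and not something this page confirms. The sample strings and the function names `parse_stopwords`, `merge_stopwords`, and `remove_stopwords` are hypothetical and stand in for the real files.

```python
from typing import Iterable


def parse_stopwords(text: str) -> set[str]:
    """Parse a stopword list with one entry per line, skipping blank lines."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def merge_stopwords(lists: Iterable[str]) -> set[str]:
    """Union several stopword lists into a single set."""
    merged: set[str] = set()
    for text in lists:
        merged |= parse_stopwords(text)
    return merged


def remove_stopwords(tokens: Iterable[str], stopwords: set[str]) -> list[str]:
    """Keep only the tokens that are not stopwords, preserving order."""
    return [token for token in tokens if token not in stopwords]


# Hypothetical excerpts standing in for two separate stopword files.
list_a = "的\n了\n和\n"
list_b = "的\n是\n在\n\n"

stopwords = merge_stopwords([list_a, list_b])

# Tokens are assumed to come from a word segmenter applied beforehand.
tokens = ["我", "在", "北京", "的", "大学", "学习"]
print(remove_stopwords(tokens, stopwords))
```

Merging into a set removes entries that appear in more than one list and keeps each lookup cheap during filtering.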

☆149

Alternatives and similar repositories for Stopwords

Users who are interested in Stopwords are comparing it to the libraries listed below.

hiDaDeng / cntext
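cntext is a Chinese text analysis Python library designed for empirical social science research. Beyond traditional word-frequency counting and sentiment analysis, it supports advanced features such as word-embedding training and semantic projection, helping researchers measure abstract constructs (such as attitudes, cognition, cultural beliefs, and psychological states) from large-scale unstructured text.
☆369 Updated 2 weeks ago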
lynn1885 / BERTopic-Tutorial
☆209 Updated last year
kiwirafe / xiangsi
Chinese text similarity calculator
☆160 Updated last year
ppzhenghua / SentimentAnalysisDictionary
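A collection of Chinese sentiment lexicons (the National Taiwan University NTUSD Simplified Chinese sentiment lexicon, Tsinghua University's Li Jun Chinese positive/negative word lexicon, the HowNet sentiment lexicon, etc.)
☆196 Updated 6 months ago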
S-T-Full-Text-Knowledge-Mining / ChpoBERT
☆14 Updated last year
hellonlp / sentiment-analysis
Sentiment analysis, text classification, lexicons, Bayes, TextCNN, classification, TensorFlow, BERT, CNN
☆499 Updated 3 months ago
ydli-ai / CSL
[COLING 2022] CSL: A Large-scale Chinese Scientific Literature Dataset
☆642 Updated 2 years ago
Cyberbolt / Cemotion
A Chinese NLP library based on BERT for sentiment analysis and general-purpose Chinese word segmentation.
☆217 Updated 2 months ago
murray-z / text_analysis_tools
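Chinese text analysis toolkit (including text classification, text clustering, text similarity, keyword extraction, key phrase extraction, sentiment analysis, text error correction, text summarization, topic keywords, synonyms and near-synonyms, and event triple extraction)
☆721 Updated 2 years ago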
rsanshierli / EasyBert
BERT applications based on PyTorch, including named entity recognition, sentiment analysis, text classification, and text similarity
☆803 Updated 4 years ago
hiDaDeng / cnsenti
Chinese sentiment analysis library (Chinese Sentiment) for emotion analysis and positive/negative sentiment analysis of text. Text analysis, supporting multiple methods including word count, readability, document simil…
☆566 Updated 2 years ago
taishan1994 / awesome-chinese-ner
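Chinese named entity recognition. Includes the latest Chinese NER papers, related Chinese entity recognition tools, and datasets, as well as Chinese pretrained models, word vectors, and NER surveys.
☆741 Updated 3 months ago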
BrownSweater / BERT_SMP2020-EWECT
Fine-tunes a BERT model pretrained on a Chinese corpus for text classification on the SMP2020 Weibo emotion classification task.
☆112 Updated 3 years ago
hellonlp / classifier-multi-label
Multi-label text classification, multi-label classification, text classification, multi-label, classifier, BERT, seq2seq, attention, multi-label-classification
☆792 Updated 9 months ago
tomatoyou / LDA-topic-extractor
LDA topic model | topic perplexity | multiple texts
☆18 Updated 8 months ago
blmoistawinde / HarvestText
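Text mining and preprocessing tools (text cleaning, new word discovery, sentiment analysis, entity recognition and linking, keyword extraction, knowledge extraction, syntactic parsing, etc.), using unsupervised or weakly supervised methods
☆2,557 Updated last year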
catqaq / OpenTextClassification
OpenTextClassification is all you need for text classification! Open text classification for everyone, enjoy your NLP journey! This may be the most comprehensive to date…
☆208 Updated last year
hcd233 / fine-tuning-Bert-for-sentiment-analysis
Chinese sentiment analysis by fine-tuning bert-base-chinese, trained for 5 epochs on the WeiboSenti100k dataset until convergence
☆37 Updated 2 years ago
shibing624 / nlp-tutorial
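Natural language processing (NLP) tutorial, covering word vectors, lexical analysis, pretrained language models, text classification, text semantic matching, information extraction, translation, and dialogue.
☆465 Updated 3 years ago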
HUSTAI / uie_pytorch
PyTorch implementation of the PaddleNLP UIE model
☆652 Updated 2 years ago
shibing624 / similarities
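Similarities: a toolkit for similarity calculation and semantic search. Supports text-to-text, text-to-image, and image-to-image search over hundreds of millions of records; developed in Python 3 and ready to use out of the box.
☆873 Updated 11 months ago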
jasoncao11 / nlp-notebook
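Implementations of common NLP tasks, including new word discovery and PyTorch-based word vectors, Chinese text classification, entity recognition, summary generation, sentence similarity, triple extraction, pretrained models, and more.
☆535 Updated 2 years ago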
lijqhs / text-classification-cn
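Chinese text classification in practice, based on the Sogou news corpus, using traditional machine learning methods as well as pretrained models
☆192 Updated 4 years ago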
thu-coai / COLDataset
The official repository of the paper: COLD: A Benchmark for Chinese Offensive Language Detection
☆288 Updated 2 years ago
JackHCC / Chinese-Text-Classification-PyTorch
Chinese text classification implemented in PyTorch (TextCNN, TextRNN, FastText, TextRCNN, BiLSTM_Attention, DPCNN, Transformer, Bert, ERNIE), ready to use out of the box!
☆400 Updated 2 years ago
prnake / CialloCorpus
People's Daily (1946-2024), a database of Xi Jinping's important speeches, and classical Chinese poetry and prose
☆68 Updated 6 months ago
km1994 / AwesomeNLP
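This project completes all tasks of NLP-Beginner, a set of introductory NLP exercises (text classification, information extraction, knowledge graphs, machine translation, question answering, text generation, Text-to-SQL, text error correction, text mining, knowledge distillation, model acceleration, OCR, TTS, Prompt, embedding, etc.); all code has been tested…
☆210 Updated last year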
shibing624 / pytextclassifier
pytextclassifier is a toolkit for text classification. Implements classification models including LR, Xgboost, TextCNN, FastText, TextRNN, and BERT, ready to use out of the box.
☆514 Updated last year
psgetit / Chinese_Text_Classification_Pytorch
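Chinese: an easy-to-use text classification model with training and inference fully open! Feel free to star the repository and then politely request access. Broadly, this project uses the base version of ERNIE 3.0 to convert text into semantic vectors, which then serve as features for classification; in testing the ceiling is very high, reaching 92% accuracy on a 61-class task after optimization.
☆49 Updated last year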
yongzhuo / Pytorch-NLU
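Chinese text classification and sequence labeling toolkit (PyTorch), supporting multi-class and multi-label classification of long and short Chinese texts, as well as sequence labeling tasks such as Chinese named entity recognition, part-of-speech tagging, word segmentation, and extractive text summarization. Chinese text classification and sequence labeling toolk…
☆350 Updated last year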