A collection of Chinese text analysis tools, corpora, and pretrained model resources.
☆144 · Sep 12, 2025 · Updated 6 months ago
Alternatives and similar repositories for Chinese-Pretrained-Word-Embeddings
Users that are interested in Chinese-Pretrained-Word-Embeddings are comparing it to the libraries listed below.
- Chinese sentiment analysis library (Chinese Sentiment) for emotion analysis and positive/negative sentiment analysis of text. Text analysis, supporting multiple methods including word count, readability, document simil… ☆583 · Dec 9, 2022 · Updated 3 years ago
- Covers web scraping, databases, data analysis, machine learning, visualization, text analysis, GUIs, and office automation ☆13 · Jan 14, 2022 · Updated 4 years ago
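- cntext is a Python library for Chinese text analysis, designed for empirical social science research. Beyond traditional word-frequency statistics and sentiment analysis, it supports advanced features such as word-embedding training and semantic projection, helping researchers measure abstract constructs (such as attitudes, cognition, cultural beliefs, and psychological states) from large-scale unstructured text. ☆435 · Nov 21, 2025 · Updated 4 months ago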
- ☆10 · May 20, 2024 · Updated last year
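- Chinese text analysis toolkit (text classification, text clustering, text similarity, keyword extraction, key phrase extraction, sentiment analysis, text error correction, text summarization, topic keywords, synonyms and near-synonyms, event triple extraction) ☆732 · Oct 3, 2023 · Updated 2 years ago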
- Computes text similarity metrics between two documents ☆13 · May 5, 2023 · Updated 2 years ago
- ☆11 · Apr 4, 2018 · Updated 7 years ago
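- Text mining and preprocessing tools (text cleaning, new word discovery, sentiment analysis, entity recognition and linking, keyword extraction, knowledge extraction, syntactic parsing, etc.), using unsupervised or weakly supervised methods ☆2,603 · May 13, 2024 · Updated last year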
- Computes text similarity using TF-IDF and cosine similarity ☆36 · Aug 29, 2018 · Updated 7 years ago
- Materials for "Language Models for Law and Social Science" (ETH Zurich), Spring 2024 ☆28 · May 6, 2024 · Updated last year
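- Scrapes Weibo data with Python, then analyzes and visualizes the text: LDA (treemap), relationship graphs, word clouds, time trends (line charts), popularity heat maps, dictionary-based sentiment analysis (pie charts and 3D bar charts), word-vector neural-network sentiment analysis, TF-IDF clustering, word-vector clustering, keyword extraction, text similarity analysis, and more ☆949 · Aug 28, 2020 · Updated 5 years ago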
- AlphaReadabilityChinese is a tool that calculates the readability of Chinese texts, which includes indices at lexical, syntactic, and sem… ☆38 · Mar 30, 2024 · Updated last year
- Applied a BERT-based model to extract relations from 29 annual reports of listed companies and news; used the spaCy library and a BERT model for … ☆13 · Feb 2, 2022 · Updated 4 years ago
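- Quickly builds domain-specific sentiment lexicons (mobile phones, cars, etc.) using the SO_PMI mutual-information algorithm and word-vector methods ☆94 · Nov 16, 2021 · Updated 4 years ago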
- Spark and Python study notes ☆11 · Sep 25, 2018 · Updated 7 years ago
- ☆12 · Apr 24, 2024 · Updated last year
- Answers to some "weird" statistics questions with R code ☆10 · Jun 8, 2025 · Updated 9 months ago
- Aligned bilingual word vectors for English and Chinese ☆11 · Jun 25, 2018 · Updated 7 years ago
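- Distributed crawler for listed-company announcements on CNINFO (巨潮资讯网), built on a distributed Scrapy and Kafka architecture. It can crawl all announcements for a specified list of listed companies within a specified time period and save them as PDFs. Search engine functionality is planned for later. ☆19 · Oct 24, 2019 · Updated 6 years ago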
- Hands-on text classification, text mining, sentiment analysis, and text generation ☆14 · Mar 22, 2023 · Updated 3 years ago
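- Under the 2019 individual income tax policy, with the tax threshold raised to 5,000, computes the salaries for every month of the year in one pass using the annual cumulative progressive method ☆11 · Sep 15, 2022 · Updated 3 years ago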
- ☆11 · Nov 27, 2018 · Updated 7 years ago
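- Public JD.com/Taobao customer-service dialogue data; a dialogue system designed around a seq2seq generative model that won second place ☆44 · Dec 8, 2022 · Updated 3 years ago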
- Mines element-recognition rules from the CEC corpus and uses them to automatically annotate raw news-report corpora ☆20 · May 14, 2015 · Updated 10 years ago
- Labs for USC's COMM 557: Data Science for Communication & Social Networks taught by Emilio Ferrara during the Fall 2020 semester. ☆11 · Apr 19, 2022 · Updated 3 years ago
- Weibo comment retrieval (API) and sentiment analysis ☆13 · Jan 23, 2020 · Updated 6 years ago
- AI course project: research and analysis of deep neural network models and algorithms for computing text similarity (BERT, SentenceBERT, SimCSE) ☆17 · Jul 11, 2022 · Updated 3 years ago
- Chinese Sentiment Analysis: sentiment analysis for Chinese text ☆192 · Mar 10, 2026 · Updated last week
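- In recent years, listed companies have periodically been caught falsifying financial data or suddenly collapsing. Working from a listed company's multi-year financial reports, this project screens data indicators for tracking and analysis, to tell genuine figures from fake ones and avoid investment landmines 🤣. Whoever fakes the numbers is a 🐱🐉😒 ☆11 · Jun 30, 2022 · Updated 3 years ago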
- Spatial tile cache that saves its data into the IndexedDB of your browser ☆14 · Jun 1, 2023 · Updated 2 years ago
- CMOS, a large-scale dataset of Chinese moral sentences ☆10 · Apr 11, 2022 · Updated 3 years ago
- Estimates the statistical power of linear mixed models ☆15 · Oct 14, 2020 · Updated 5 years ago
- Chinese sentiment analysis, CNN, Bi-LSTM, text classification ☆1,080 · Oct 22, 2022 · Updated 3 years ago
- Python package to calculate Boilerplate and many other text quantified features ☆20 · Feb 1, 2025 · Updated last year
- Cosine similarity calculation for articles, implemented in Python 3 ☆10 · Sep 28, 2017 · Updated 8 years ago
- ☆11 · Feb 26, 2023 · Updated 3 years ago
- Correlation analysis between Chinese names and gender ☆13 · May 16, 2016 · Updated 9 years ago
- Extracts key information from Chinese email using CRF and HMM ☆14 · Apr 22, 2019 · Updated 6 years ago
- XGB-based prediction of financial fraud at listed companies ☆14 · Jun 5, 2021 · Updated 4 years ago