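ChineseDiachronicCorpus is a Chinese diachronic corpus spanning more than sixty years. It includes the Tencent diachronic news corpus (2000-2016), the People's Daily diachronic corpus (1946-2003), and the Reference News (参考消息) diachronic corpus (1957-2002). Built from diachronic circulating-text corpora, it provides foundational corpus support for computing diachronic language change, language monitoring, and research on social and cultural change.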
☆23Jan 10, 2021Updated 5 years ago
Alternatives and similar repositories for ChineseDiachronicCorpus
Users who are interested in ChineseDiachronicCorpus are comparing it to the libraries listed below.
- People's Daily (1946-2024), a database of Xi Jinping's series of important speeches, and classical poetry and prose☆85Mar 23, 2025Updated last year
- ☆55Jun 4, 2024Updated last year
- A data mining framework written in C++.☆23Apr 2, 2013Updated 13 years ago
- Repository for the CommonLit Ease of Readability Corpus☆24Apr 17, 2024Updated 2 years ago
- Article cosine similarity calculation implemented in Python 3☆10Sep 28, 2017Updated 8 years ago
- Winning solution for the 2019 iFLYTEK Developer Competition: Alzheimer's Disease Prediction Challenge☆14Oct 27, 2019Updated 6 years ago
- People's Daily article dataset (1949-1978)☆20Jul 9, 2020Updated 5 years ago
- Automatic missing value imputation using random forests☆14Aug 19, 2015Updated 10 years ago
- Dynamic Topic Modelling Tutorial Files☆14May 12, 2015Updated 10 years ago
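- Function: given a newly input text, compare its similarity against existing data and return the top 10 texts. Main methods: jieba Chinese word segmentation, gensim, TF-IDF term weighting, and cosine similarity.☆11Jul 30, 2020Updated 5 years ago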
- Supplementary code for "News Frame Analysis: An Inductive Mixed-method Computational Approach" http://dx.doi.org/10.1080/19312458.2019.16…☆15Nov 13, 2020Updated 5 years ago
- (NAACL 2024) Official code repository for Mixset.☆26Dec 4, 2024Updated last year
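- To better manage blog posts and share knowledge, this series of resources is a backup of the author's CSDN blog. This resource is the author's Python data mining course series, mainly content shared in the author's courses such as "Data Mining" and "Big Data Analysis and Technology", covering Python basics, web crawling, clustering, classification, regression, sentiment analysis, visual analysis, and more; introductory…☆23Mar 14, 2020Updated 6 years ago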
- Demo for the calculation of the Semantic Brand Score (Basic Version)☆13Sep 1, 2020Updated 5 years ago
- AlphaReadabilityChinese is a tool that calculates the readability of Chinese texts, which includes indices at lexical, syntactic, and sem…☆39Mar 30, 2024Updated 2 years ago
- SeqXGPT: An advanced method for sentence-level AI-generated text detection.☆101Oct 16, 2023Updated 2 years ago
- Small tutorial on how you can use BERT for Topic Modeling☆18Jun 1, 2021Updated 4 years ago
- Cantonese segmentation tool 粵語分詞工具☆31Aug 22, 2020Updated 5 years ago
- Code for ACL 2024 long paper: Are AI-Generated Text Detectors Robust to Adversarial Perturbations?☆33Jul 12, 2024Updated last year
- BERT and RoBERTa pre-training code, implemented in both TensorFlow and PyTorch versions☆13Feb 8, 2023Updated 3 years ago
- ☆17Jan 31, 2025Updated last year
- ☆16Apr 30, 2025Updated 11 months ago
- Uses the open-source Bert-as-Service with a pre-trained model to generate document feature vectors, clusters COVID-19 literature with k-means, visualizes the data with t-SNE, generates topic keywords for each cluster with LDA, and draws Bokeh plots for searching and filtering data by cluster and keyword.☆19Aug 3, 2020Updated 5 years ago
- Visual comparison of word vectors trained with word2vec and word vectors generated by BERT☆15Jun 29, 2020Updated 5 years ago
- a fast async pool based on channel☆26Jan 22, 2026Updated 2 months ago
- Complete awk reference manual☆22Jan 10, 2023Updated 3 years ago
- Nonlinear Granger causality using machine learning techniques☆22Sep 8, 2023Updated 2 years ago
- ☆19Jun 13, 2019Updated 6 years ago
- The first Chinese metaphor corpus serving for identification and generation. 中文比喻数据集. Presented at COLING 2022.☆46Jan 25, 2023Updated 3 years ago
- Fake News Detection - Feature Extraction using Vectorization such as Count Vectorizer, TFIDF Vectorizer, Hash Vectorizer. Then used an E…☆21Feb 21, 2020Updated 6 years ago
- A modern fullstack starter kit powered by Next.js 16, Tailwind CSS v4, shadcn/ui, Prisma, and Supabase — perfect for building fast, scala…☆47Updated this week
- [NeurIPS 2024 D&B] DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios☆48Dec 10, 2024Updated last year
- An implementation of the exponential random graph model☆28May 14, 2014Updated 11 years ago
- Full-coverage crawler for Zhihu answers, columns, and comments☆17Mar 11, 2023Updated 3 years ago
- ☆22Feb 5, 2026Updated 2 months ago
- Implementation of Dynamic Embedding Topic Modeling on arxiv.org articles☆21Apr 24, 2022Updated 3 years ago
- Large-scale Chinese corpus☆44Nov 5, 2019Updated 6 years ago
- Lightweight Zhihu crawler supporting questions, favorites, and this month's hottest posts☆24Dec 19, 2018Updated 7 years ago
- ☆46Apr 19, 2024Updated last year