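ChineseDiachronicCorpus: a Chinese diachronic corpus spanning more than sixty years, comprising Tencent news (2000-2016), People's Daily (1946-2003), and Reference News (参考消息, 1957-2002). Built as a diachronic corpus of texts in circulation, it provides foundational data for computing diachronic language change, language monitoring, and research on social and cultural change.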
☆23 · Jan 10, 2021 · Updated 5 years ago
Alternatives and similar repositories for ChineseDiachronicCorpus
Users that are interested in ChineseDiachronicCorpus are comparing it to the libraries listed below
- People's Daily (1946-2024), a database of Xi Jinping's important speeches, and classical Chinese poetry and prose ☆82 · Mar 23, 2025 · Updated 11 months ago
- This repository provides the code for applying Contrastive Learning Penalty Loss (CLPL) and Mixture of Experts (MoE) to the BGE-M3 text e… ☆11 · Dec 27, 2024 · Updated last year
- The code implementation of the paper Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks (A… ☆13 · Jul 16, 2024 · Updated last year
- ☆12 · Dec 13, 2022 · Updated 3 years ago
- Latent Dirichlet Allocation and Dynamic Topic Modeling ☆10 · Aug 11, 2021 · Updated 4 years ago
- Automatic missing value imputation using random forests ☆14 · Aug 19, 2015 · Updated 10 years ago
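- Given a new input text, compares its similarity against existing data and returns the top 10 most similar texts. Main methods: jieba Chinese word segmentation, gensim, TF-IDF term weighting, and cosine similarity. ☆11 · Jul 30, 2020 · Updated 5 years ago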
- Dynamic Topic Modelling Tutorial Files ☆13 · May 12, 2015 · Updated 10 years ago
- Supplementary code for "News Frame Analysis: An Inductive Mixed-method Computational Approach" http://dx.doi.org/10.1080/19312458.2019.16… ☆15 · Nov 13, 2020 · Updated 5 years ago
- ☆16 · Apr 30, 2025 · Updated 10 months ago
- The code implementation of the paper CoCo: Coherence-Enhanced Machine-Generated Text Detection Under Low Resource With Contrastive Learni… ☆16 · Mar 26, 2024 · Updated last year
- Small tutorial on how you can use BERT for Topic Modeling ☆18 · Jun 1, 2021 · Updated 4 years ago
- MiniGPT-4 :: Updated to Torch 2.0, simple setup, easier API, cut out training code ☆15 · Jun 12, 2023 · Updated 2 years ago
- ☆18 · Jun 13, 2023 · Updated 2 years ago
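- Uses the open-source Bert-as-Service pretrained model to generate document feature vectors, clusters COVID-19 literature with k-means, visualizes the data with t-SNE, generates topic keywords for each cluster with LDA, and draws Bokeh plots for searching and filtering the data by cluster and keyword. ☆19 · Aug 3, 2020 · Updated 5 years ago
- Visual comparison of word vectors trained with word2vec and word vectors generated by BERT ☆15 · Jun 29, 2020 · Updated 5 years ago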
- Fake News Detection - Feature Extraction using Vectorization such as Count Vectorizer, TFIDF Vectorizer, Hash Vectorizer. Then used an E… ☆20 · Feb 21, 2020 · Updated 6 years ago
- This project aims to build upon existing MGTBench project, extending its functionalities with the option to import and evaluate the bench… ☆21 · Nov 5, 2024 · Updated last year
- Full-coverage crawler for Zhihu answers, columns, and comments ☆17 · Mar 11, 2023 · Updated 2 years ago
- WordBias: Visualizing Intersectional Social biases encoded in Word Embeddings ☆23 · Aug 18, 2025 · Updated 6 months ago
- ☆25 · Sep 16, 2025 · Updated 5 months ago
- A Sohu news crawler written in Java ☆14 · May 2, 2017 · Updated 8 years ago
- Official Implementation of NeurIPS 2024 paper - BiScope: AI-generated Text Detection by Checking Memorization of Preceding Tokens ☆28 · Feb 17, 2026 · Updated 2 weeks ago
- fastText vectors created from Hong Kong data. ☆22 · Jul 7, 2020 · Updated 5 years ago
- It is a simple demo of chatDB workflow in dify. ☆24 · Dec 7, 2024 · Updated last year
- Training word vectors ☆22 · Sep 26, 2020 · Updated 5 years ago
- An implementation of the exponential random graph model ☆27 · May 14, 2014 · Updated 11 years ago
- (NAACL 2024) Official code repository for Mixset. ☆27 · Dec 4, 2024 · Updated last year
- This package consists of functionalities for dynamic topic modelling and its visualization ☆26 · May 16, 2020 · Updated 5 years ago
- A Python script for scraping LIHKG ☆32 · Mar 7, 2022 · Updated 4 years ago
- Cantonese word segmentation tool ☆30 · Aug 22, 2020 · Updated 5 years ago
- Generates sentence or word vectors using a pretrained BERT model ☆26 · Oct 29, 2020 · Updated 5 years ago
- AlphaReadabilityChinese is a tool that calculates the readability of Chinese texts, which includes indices at lexical, syntactic, and sem… ☆38 · Mar 30, 2024 · Updated last year
- Spoken Cantonese from Hong Kong. ☆30 · Nov 12, 2025 · Updated 3 months ago
- Lightweight Zhihu crawler supporting questions, collections, and this month's hottest posts ☆24 · Dec 19, 2018 · Updated 7 years ago
- [NeurIPS 2024 D&B] DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios ☆46 · Dec 10, 2024 · Updated last year
- Aligned Neural Topic Model (ANTM) for Exploring Evolving Topics: a dynamic neural topic model that uses document embeddings (data2vec) to… ☆37 · Nov 6, 2023 · Updated 2 years ago
- Granger Causality library in python ☆38 · Nov 19, 2021 · Updated 4 years ago
- Paper list of dementia detection ☆41 · Jun 19, 2025 · Updated 8 months ago