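ChineseDiachronicCorpus: a Chinese diachronic corpus spanning more than sixty years. It comprises Tencent news (2000-2016), People's Daily (1946-2003), and Reference News (1957-2002). As a diachronic corpus of circulated media text, it provides foundational data for computing diachronic language change, language monitoring, and research on social and cultural change.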
☆24 · Jan 10, 2021 · Updated 5 years ago
Alternatives and similar repositories for ChineseDiachronicCorpus
Users who are interested in ChineseDiachronicCorpus are comparing it to the libraries listed below.
- People's Daily (1946-2024), a database of Xi Jinping's important speeches, and classical Chinese poetry and prose ☆91 · Mar 23, 2025 · Updated last year
- ☆56 · Jun 4, 2024 · Updated 2 years ago
- The code implementation of the paper Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks (A… ☆13 · Jul 16, 2024 · Updated last year
- Repository for the CommonLit Ease of Readability Corpus ☆24 · Apr 17, 2024 · Updated 2 years ago
- Auto Generate Speech Script From PPT - Based on ChatGPT ☆15 · Sep 1, 2023 · Updated 2 years ago
- Article cosine-similarity computation implemented in Python 3 ☆10 · Sep 28, 2017 · Updated 8 years ago
- The code implementation of the paper CoCo: Coherence-Enhanced Machine-Generated Text Detection Under Low Resource With Contrastive Learni… ☆17 · Mar 26, 2024 · Updated 2 years ago
- PyTorch code for EMNLP 2021 paper: Don't be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialog… ☆28 · Oct 4, 2021 · Updated 4 years ago
- [IJCAI 2025] In-Context Meta LoRA Generation ☆33 · Jul 29, 2025 · Updated 10 months ago
- MiniGPT-4 :: Updated to Torch 2.0, simple setup, easier API, cut out training code ☆15 · Jun 12, 2023 · Updated 3 years ago
- fastText vectors created from Hong Kong data. ☆22 · Jul 7, 2020 · Updated 5 years ago
- People's Daily article dataset (1949-1978) ☆20 · Jul 9, 2020 · Updated 5 years ago
- Automatic missing value imputation using random forests ☆14 · Aug 19, 2015 · Updated 10 years ago
- QuanSyn: A Python Package for Quantitative Syntax Analysis. ☆38 · Apr 7, 2026 · Updated 2 months ago
- Telegram monitoring bot, supporting on-demand retrieval and message subscription ☆14 · May 30, 2020 · Updated 6 years ago
- Code for paper 'Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning' ☆18 · Apr 19, 2024 · Updated 2 years ago
- Dynamic Topic Modelling Tutorial Files ☆14 · May 12, 2015 · Updated 11 years ago
- Given a new input text, compares it against existing data by similarity and returns the top 10 most similar texts. Main methods: jieba Chinese word segmentation, gensim, TF-IDF term weighting, and cosine similarity. ☆11 · Jul 30, 2020 · Updated 5 years ago
- Supplementary code for "News Frame Analysis: An Inductive Mixed-method Computational Approach" http://dx.doi.org/10.1080/19312458.2019.16… ☆16 · Nov 13, 2020 · Updated 5 years ago
- (NAACL 2024) Official code repository for Mixset. ☆26 · Dec 4, 2024 · Updated last year
- Latent Dirichlet Allocation and Dynamic Topic Modeling ☆10 · Aug 11, 2021 · Updated 4 years ago
- AI-based PDF-to-TXT conversion (PDF text recognition) ☆19 · Aug 8, 2022 · Updated 3 years ago
- Demo for the calculation of the Semantic Brand Score (Basic Version) ☆13 · Sep 1, 2020 · Updated 5 years ago
- Sohu news crawler written in Java ☆14 · May 2, 2017 · Updated 9 years ago
- SeqXGPT: An advanced method for sentence-level AI-generated text detection. ☆100 · Oct 16, 2023 · Updated 2 years ago
- Electric power data forecasting and analysis ☆18 · Aug 17, 2020 · Updated 5 years ago
- Small tutorial on how you can use BERT for Topic Modeling ☆18 · Jun 1, 2021 · Updated 5 years ago
- Code for ACL 2024 long paper: Are AI-Generated Text Detectors Robust to Adversarial Perturbations? ☆33 · Jul 12, 2024 · Updated last year
- Cantonese word segmentation tool (粵語分詞工具) ☆31 · Aug 22, 2020 · Updated 5 years ago
- An automated data pipeline scaling RL to pretraining levels ☆77 · Jun 2, 2026 · Updated 2 weeks ago
- Spoken Cantonese from Hong Kong. ☆30 · May 6, 2026 · Updated last month
- ☆27 · Jun 5, 2023 · Updated 3 years ago
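- Chinese text error correction based on the T5 model ☆34 · Nov 3, 2024 · Updated last year
- BERT and RoBERTa pre-training code, implemented in both TensorFlow and Torch versions ☆13 · Feb 8, 2023 · Updated 3 years ago
- ☆16 · Apr 30, 2025 · Updated last year
- Generates document feature vectors with the open-source pre-trained Bert-as-Service, clusters COVID-19 literature with k-means, visualizes the data with t-SNE, generates topic keywords for each cluster with LDA, and draws Bokeh plots for searching and filtering the data by cluster and keyword. ☆19 · Aug 3, 2020 · Updated 5 years ago
- ☆15 · Jul 26, 2022 · Updated 3 years ago
- Quickly deploy an attractive personal homepage on home.ustc.edu.cn using Jekyll and GitHub Actions ☆15 · Sep 15, 2022 · Updated 3 years ago
- Visual comparison of word vectors trained with word2vec against word vectors generated by BERT ☆15 · Jun 29, 2020 · Updated 5 years ago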