Large-scale Chinese corpus
☆44Nov 5, 2019Updated 6 years ago
Alternatives and similar repositories for C4-zh
Users that are interested in C4-zh are comparing it to the libraries listed below.
- Baidu Baike crawler☆34Nov 3, 2019Updated 6 years ago
- Chinese machine reading comprehension dataset☆109Mar 29, 2021Updated 5 years ago
- MNBVC text quality classification using fastText☆16Oct 2, 2023Updated 2 years ago
- Chinese natural language inference dataset (a large-scale Chinese natural language inference and semantic similarity calculation dataset)☆434Feb 10, 2020Updated 6 years ago
- Official code for "Automated Scoring for Reading Comprehension via In-context BERT Tuning" (AIED 2022)☆13May 23, 2022Updated 4 years ago
- Non-autoregressive Translation by Learning Target Categorical Codes☆11Jul 11, 2021Updated 4 years ago
- This repository contains the code for the Transformer-Representation Neural Topic Model (TNTM) based on the paper "Probabilistic Topic Mo…☆12Jul 6, 2024Updated last year
- Chinese AMR Corpus☆39Apr 11, 2025Updated last year
- ☆16Jul 19, 2024Updated last year
- Explore what LLMs are really learning over SFT☆28Mar 30, 2024Updated 2 years ago
- ☆13Oct 19, 2023Updated 2 years ago
- Syntax-aware Word Mover’s Distance for Sentence Similarity Modeling☆20Nov 6, 2023Updated 2 years ago
- Remove duplicate documents/videos/images via popular algorithms such as SimHash, SpotSig, Shingling, etc.☆18Aug 28, 2023Updated 2 years ago
- LLM KV Cache compression - K+V dual compression, 73-99% VRAM savings, zero accuracy loss☆57Mar 30, 2026Updated 3 months ago
- ☆23Dec 31, 2020Updated 5 years ago
- Finding of ACL2023: Clustering-Aware Negative Sampling for Unsupervised Sentence Representation☆13Oct 16, 2023Updated 2 years ago
- Some methods for few-shot learning☆13Jul 28, 2019Updated 6 years ago
- bumble bee transformer☆14Apr 19, 2021Updated 5 years ago
- OCNLI: Original Chinese Natural Language Inference task☆167Sep 23, 2021Updated 4 years ago
- ☆219Dec 8, 2022Updated 3 years ago
- Chinese machine reading comprehension dataset☆65Jan 15, 2020Updated 6 years ago
- Visualization for hidden Markov model computations☆14Dec 19, 2014Updated 11 years ago
- ☆11Aug 2, 2022Updated 3 years ago
- Open Source Simple Web Crawler for Java. Simple Flexible And Lightweight☆30Sep 1, 2022Updated 3 years ago
- ☆14Apr 6, 2014Updated 12 years ago
- Ant Financial natural language processing competition.☆10Sep 3, 2018Updated 7 years ago
- Discogs-VI dataset and code☆21Dec 13, 2024Updated last year
- Sohu news crawler written in Java☆14May 2, 2017Updated 9 years ago
- ☆21Sep 12, 2023Updated 2 years ago
- [ACL 2021] mTVR: Multilingual Video Moment Retrieval☆27Aug 20, 2022Updated 3 years ago
- Hardware video encode/decode on the raspberry pi using the MMAL API☆32Oct 4, 2018Updated 7 years ago
- Info for prospective PhD students for Chris Donahue's lab at CMU starting Fall 23.☆12Nov 13, 2022Updated 3 years ago
- Tool to create GPT disk image files☆12May 29, 2025Updated last year
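- Word- and sentence-level pinyin-to-Chinese-character conversion, pinyin segmentation, pinyin completion, and Chinese input in pygame☆15Mar 21, 2020Updated 6 years ago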
- ☆25Apr 3, 2024Updated 2 years ago
- This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…☆59Apr 20, 2024Updated 2 years ago
- Code and data for the paper "Dual Dynamic Memory Network for End-to-End Multi-turn Task-oriented Dialog Systems".☆14Aug 16, 2022Updated 3 years ago
- Train and filter data using Subcenter ArcFace model in Pytorch☆17Nov 16, 2021Updated 4 years ago
- Corpus creator for Chinese Wikipedia☆40Jun 30, 2021Updated 5 years ago