shenfei1010 / CyberCan
CyberCan is a lexicon of contemporary Cantonese based on more than 100 million internet texts from discussion forums in Hong Kong.
☆12 Updated 3 years ago
Alternatives and similar repositories for CyberCan:
Users who are interested in CyberCan are comparing it to the repositories listed below:
- A Package for Cantonese Tokenisation ☆17 Updated 3 years ago
- Cantonese word segmentation tool ☆30 Updated 4 years ago
- Twitter dataset for the 2022 Russian and Ukrainian crisis ☆49 Updated 2 years ago
- Chinese Moral Foundation Dictionary ☆17 Updated last year
- Chinese Dialect Database ☆17 Updated 7 years ago
- A frequency lexicon for Hong Kong Cantonese ☆22 Updated 4 years ago
- Pre-trained ELECTRA from Hong Kong data ☆29 Updated 4 years ago
- BERT Tokenizer with vocabulary tailored for Cantonese ☆21 Updated 2 years ago
- Data on TikTok videos related to the 2024 U.S. elections ☆21 Updated 2 months ago
- Loengfan (粵語兩分) is the Cantonese version of the Liang Fen input method ☆12 Updated 3 years ago
- The Extended Moral Foundations Dictionary (E-MFD) ☆40 Updated 4 years ago
- BirdSpotter is a Python package that provides an influence and bot detection toolkit for Twitter. ☆19 Updated 4 years ago
- fastText vectors created from Hong Kong data. ☆21 Updated 4 years ago
- Learning from Neighbors: Unsupervised Text Classification ☆17 Updated 2 years ago
- The official GitHub repository for the American Stories dataset as in {link} ☆117 Updated last year
- A simple toolkit for conducting analyses using corpus methods ☆25 Updated 3 years ago
- Driver for LIWC2015 analysis. LIWC2015 dictionary not included. ☆16 Updated 2 years ago
- R scraper for LIHKG, the Hong Kong version of Reddit. ☆16 Updated 4 years ago
- Natural Language Processing for Political Science ☆20 Updated 7 years ago
- List of commonly used Chinese characters and words ☆12 Updated last year
- Additional material for the paper "MoralStrength: Exploiting a Moral Lexicon and Embedding Similarity for Moral Foundations Prediction" ☆54 Updated 2 years ago
- Packager for the Corpus of Mid-20th Century Hong Kong Cantonese (《香港二十世紀中期粵語語料庫》) ☆16 Updated 9 years ago
- Repository for the paper "Us vs. Them: A Dataset of Populist Attitudes, News Bias and Emotions" ☆16 Updated 10 months ago
- Cantonese romanization conversion tables ☆33 Updated 2 weeks ago
- Cantonese text filter ☆39 Updated last month