Pre-trained ELECTRA from Hong Kong data
☆29 · Jul 7, 2020 · Updated 5 years ago
Alternatives and similar repositories for electra-hongkongese
Users that are interested in electra-hongkongese are comparing it to the libraries listed below.
- Scraped reviews from OpenRice for sentiment analysis. Formatted to use with BERT. ☆11 · Apr 9, 2020 · Updated 6 years ago
- Transformers for Cantonese ☆58 · Oct 24, 2020 · Updated 5 years ago
- Zero-Shot Translation implemented by Transformer ☆14 · Mar 24, 2023 · Updated 3 years ago
- Dictionary for Cantonese word segmentation ☆39 · Jun 4, 2024 · Updated 2 years ago
- 粵文語料篩選器 Cantonese text filter ☆43 · Feb 4, 2026 · Updated 4 months ago
- Cantonese Linguistics and NLP ☆411 · May 26, 2026 · Updated last month
- ☆103 · Feb 1, 2024 · Updated 2 years ago
- A curated list of resources dedicated to Natural Language Processing (NLP) of Cantonese | 粵語 NLP ☆94 · Oct 17, 2021 · Updated 4 years ago
- A frequency lexicon for Hong Kong Cantonese ☆25 · Aug 27, 2020 · Updated 5 years ago
- Cantonese word segmentation tool ☆48 · Jul 29, 2018 · Updated 7 years ago
- Answers to some "weird" statistics questions with R code ☆10 · Jun 8, 2025 · Updated last year
- ☆15 · Oct 9, 2021 · Updated 4 years ago
- A Package for Cantonese Tokenisation ☆18 · Jun 17, 2021 · Updated 5 years ago
- CyberCan is a lexicon of contemporary Cantonese based on more than 100 million pieces of internet texts from discussion forums in Hong Ko… ☆12 · Aug 24, 2021 · Updated 4 years ago
- Hong Kong Cantonese Corpus of transcribed speech (spontaneous speech, radio programmes and a monologue). ☆90 · Nov 3, 2025 · Updated 7 months ago
- ☆15 · Dec 2, 2014 · Updated 11 years ago
- 🏃 Hosting NLP models in one line ☆20 · May 8, 2024 · Updated 2 years ago
- A simple machine learning pipeline built using Apache Airflow ☆15 · Nov 22, 2022 · Updated 3 years ago
- 56 languages, 1 model: multilingual ASR ☆24 · Jul 25, 2021 · Updated 4 years ago
- A tool to convert IUPAC representations of glycans into SMILES strings. ☆17 · Aug 19, 2025 · Updated 10 months ago
- An apa7 template for quarto/posit ☆12 · Jan 25, 2023 · Updated 3 years ago
- Human Protein Atlas - Single Cell Classification 2nd place solution Dual Head pipeline ☆13 · May 25, 2021 · Updated 5 years ago
- Fortune-telling case examples (命例) ☆10 · Sep 27, 2018 · Updated 7 years ago
- 🎧 Simple bash-script to automatically download the most recent podcasts from a list of rss-feeds and upload them to your Dropbox. ☆10 · Nov 30, 2015 · Updated 10 years ago
- WordBias: Visualizing Intersectional Social biases encoded in Word Embeddings ☆23 · Aug 18, 2025 · Updated 10 months ago
- Character and word headwords from the 《现代汉语大词典》 (Great Dictionary of Modern Chinese) ☆29 · Dec 29, 2020 · Updated 5 years ago
- ☆23 · Oct 20, 2021 · Updated 4 years ago
- ☆10 · Aug 14, 2019 · Updated 6 years ago
- ☆14 · Jan 25, 2026 · Updated 5 months ago
- Pre-processing DBpedia datasets to load into Dgraph ☆13 · Mar 6, 2022 · Updated 4 years ago
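- "Daguan Cup" (达观杯) long-text intelligent processing challenge. Daguan Data provided a set of long-text data with category labels and asked participants to combine their own ingenuity with state-of-the-art NLP and AI techniques to analyze the internal structure and semantics of the text in depth and build an accurate text classification model. ☆10 · Jul 20, 2018 · Updated 7 years ago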
- TensorFlow: learn and practice ☆11 · Aug 30, 2018 · Updated 7 years ago
- Applications of adversarial training in NLP ☆14 · Nov 22, 2021 · Updated 4 years ago
- A torch version of RDrop ☆16 · Jul 15, 2021 · Updated 4 years ago
- EMNLP 2021 Tutorial: Multi-Domain Multilingual Question Answering ☆38 · Nov 7, 2021 · Updated 4 years ago
- ☆27 · Aug 14, 2025 · Updated 10 months ago
- Image clustering ☆13 · Jan 22, 2022 · Updated 4 years ago
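- Early computers used 7-bit ASCII encoding. To handle Chinese characters, programmers designed GB2312 for Simplified Chinese and Big5 for Traditional Chinese. GB2312 (1980) contains 7,445 characters in total, comprising 6,763 Chinese characters and 682 other symbols. In the Chinese character area, the internal code ranges from B0 to F7 for the high byte and from A1 to FE for the low byte, occupying code… ☆10 · Sep 10, 2017 · Updated 8 years ago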
- A curated list of resources dedicated to word segmentation ☆12 · Jan 9, 2019 · Updated 7 years ago