dalinvip/corpus_process_script

dalinvip / corpus_process_script

Chinese and English corpus processing scripts in Python, C++, and Java

☆198

Alternatives and similar repositories for corpus_process_script

Users that are interested in corpus_process_script are comparing it to the libraries listed below.

dalinvip / cw2vec
cw2vec: Learning Chinese Word Embeddings with Stroke n-gram Information
☆274 · Mar 20, 2023 · Updated 3 years ago
dalinvip / Word_Similarity_and_Word_Analogy
Word Similarity and Word Analogy Task scripts
☆71 · May 12, 2018 · Updated 8 years ago
Luka0612 / cw2vec
Character-based word vector training
☆90 · Jun 6, 2018 · Updated 8 years ago
zhang2010hao / cw2vec-pytorch
A PyTorch implementation of cw2vec
☆31 · Jan 18, 2019 · Updated 7 years ago
noobiegz / cw2vec
Implementation of the cw2vec model
☆29 · Jul 20, 2018 · Updated 8 years ago
HKUST-KnowComp / JWE
Joint Embeddings of Chinese Words, Characters, and Fine-grained Subcharacter Components
☆100 · Jun 21, 2019 · Updated 7 years ago
deadshot465 / novelcrafter-mcp
An experimental desktop client for using Claude Desktop's MCP with Novelcrafter codices.
☆11 · Dec 3, 2024 · Updated last year
thunlp / SE-WRL-SAT
Revised Version of SAT Model in "Improved Word Representation Learning with Sememes"
☆49 · Jul 30, 2020 · Updated 5 years ago
dalinvip / Awesome-Law-NLP-Research-Work
Awesome Law NLP Research Work, Paper, Competition, Online System
☆415 · Mar 20, 2023 · Updated 3 years ago
mattzheng / ChineseWiki
Compilation of a Chinese Wikipedia corpus
☆304 · Mar 7, 2018 · Updated 8 years ago
WeblateOrg / hello
Hello world demonstration for Weblate
☆15 · Jan 20, 2026 · Updated 6 months ago
Mleader2 / text_scalpel
Modifies Chinese text; adapted from the LaserTagger model. I call it "文本手术刀" (Text Scalpel). Currently, the project implements a text paraphrasing task, used for data augmentation of NLP corpora.
☆215 · Mar 24, 2023 · Updated 3 years ago
IAdmireu / ChineseSTS
Construction of a Chinese Semantic Text Similarity corpus
☆478 · Mar 7, 2018 · Updated 8 years ago
vale-cli / SubVale
A Sublime Text 3 client for Vale Server.
☆13 · Dec 7, 2020 · Updated 5 years ago
esnme / landscape
A Stylus-powered frontend CSS toolkit for building rich and beautiful web apps.
☆16 · Apr 2, 2012 · Updated 14 years ago
ProHiryu / bert-chinese-ner
Chinese NER using the pretrained language model BERT
☆973 · Feb 26, 2020 · Updated 6 years ago
brightmart / nlp_chinese_corpus
Large Scale Chinese Corpus for NLP
☆9,906 · Feb 6, 2026 · Updated 5 months ago
magesh-technovator / awesome-ai-applications
A comprehensive survey of business use cases of AI that help businesses thrive in the digital economy
☆13 · Oct 7, 2020 · Updated 5 years ago
google / arc-proselint
A proselint linter for use with Phabricator's arc command line tool.
☆17 · Jun 17, 2016 · Updated 10 years ago
yangjinfeng / emrproject
EMR annotation tool
☆19 · Oct 23, 2016 · Updated 9 years ago
DjagbleyEmmanuel / llamafile-convert_gguf_UI
This GUI aims to simplify the process of converting GGUF files to llamafile format by providing an intuitive and convenient way for users…
☆14 · Jan 2, 2026 · Updated 6 months ago
csisc / OpenCitations-Bot
A bot to add citation data from OpenCitations to Wikidata
☆12 · May 23, 2023 · Updated 3 years ago
EmbolismSoil / KNLP
C++ natural language processing library
☆14 · Jan 22, 2020 · Updated 6 years ago
hltfbk / CROMER
CROMER (CROss-document Main Events and entities Recognition) is a tool for cross-document coreference
☆12 · Jan 14, 2015 · Updated 11 years ago
zhaoyyoo / NLPCC2018_GEC
The dataset and the evaluation tool for NLPCC2018 Shared Task 2: Grammatical Error Correction (GEC).
☆55 · Mar 9, 2022 · Updated 4 years ago
jonathanwiesel / matterqus
New Disqus comment notifier for Mattermost
☆10 · Nov 19, 2015 · Updated 10 years ago
iwater / node-stanford-corenlp
A simple node.js wrapper for Stanford CoreNLP.
☆10 · Aug 7, 2014 · Updated 11 years ago
OpenMindClub / awesome-gpt-dev
☆13 · May 10, 2023 · Updated 3 years ago
blcu-nlp / GEC-Reading-List
A grammatical error correction reading list maintained by Beijing Language and Culture University Natural Language Processing Group
☆24 · Dec 22, 2020 · Updated 5 years ago
reactioncommerce / reaction-identity
☆11 · Dec 10, 2022 · Updated 3 years ago
shelleyHLX / cail
China AI and Law Challenge (CAIL, 法研杯) competition
☆79 · Feb 4, 2021 · Updated 5 years ago
tonyyet / Tony-Labs
A purely experimental project
☆11 · Sep 13, 2011 · Updated 14 years ago
ShelsonCao / cw2vec
Details of the cw2vec paper
☆82 · May 13, 2018 · Updated 8 years ago
ropensci-archive / alm
ARCHIVED R Client for the Lagotto Altmetrics Platform
☆15 · May 10, 2022 · Updated 4 years ago
molovo / zlint
A linter and code style checker for ZSH
☆20 · Feb 9, 2017 · Updated 9 years ago
Azhag / Bayesian-visual-working-memory
Bayesian Visual Working Memory in Python.
☆13 · Mar 28, 2020 · Updated 6 years ago
lixinsu / RCZoo
Question answering and reading comprehension toolkit
☆164 · Oct 16, 2022 · Updated 3 years ago
CLUEbenchmark / CLUECorpus2020
Large-scale Pre-training Corpus for Chinese (100G)
☆1,016 · Feb 6, 2026 · Updated 5 months ago
liuhuanyong / BaikeKnowledgeSchema
Crawler script for the concept taxonomy (schema) of Baidu Baike and Hudong Baike
☆38 · Apr 25, 2018 · Updated 8 years ago