This directory contains the training, test, and gold-standard data used in the 2nd International Chinese Word Segmentation Bakeoff. Also included is the script used to score the results submitted by the bakeoff participants and the simple segmenter used to generate the baseline and topline data.
☆67May 23, 2018Updated 7 years ago
Alternatives and similar repositories for icwb2-data
Users that are interested in icwb2-data are comparing it to the libraries listed below.
- Python NLP reading notebook by the DUTIR Search Engine Group☆18Sep 19, 2018Updated 7 years ago
- Tensorflow implementation of a Neural Attention Model for Abstractive Summarization.☆10Jul 20, 2020Updated 5 years ago
- Code for Unsupervised multi-granular Chinese word segmentation and term discovery via graph partition [JBI]☆16Jan 28, 2022Updated 4 years ago
- 🤗 HF Downloader (Hugging Face Downloader) 📦 A user-friendly GUI tool for downloading Hugging Face resources with enhanced connectivity…☆13Jan 5, 2025Updated last year
- Part-of-speech tagging and word segmentation implemented with a hidden Markov model (HMM)☆10Sep 28, 2017Updated 8 years ago
- My third project in NLP classes.☆27Dec 8, 2017Updated 8 years ago
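- Third-place solution and code for few-shot cross-class transfer event extraction in the financial domain☆17Dec 23, 2020Updated 5 years ago
- Uses an LSTM model trained on the data to predict accusations☆10Jan 24, 2019Updated 7 years ago
- Event extraction with BERT: the [CLS] token is used for event classification and the last-layer vectors for sequence labeling, with both tasks trained jointly.☆13Jun 7, 2021Updated 4 years ago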
- Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation☆22Sep 18, 2020Updated 5 years ago
- Topic Detection and Tracking☆19Apr 21, 2015Updated 10 years ago
- A Challenge on Dialog Systems with Retrieval Augmented Generation (FutureDial-RAG), Co-located with SLT2024 FutureDial-RAG Challenge☆11Aug 10, 2024Updated last year
- Code for CascadeBERT, Findings of EMNLP 2021☆12Mar 30, 2022Updated 4 years ago
- A Transformer seq2seq model to generate couplets.☆17Feb 1, 2019Updated 7 years ago
- Example of a method for extracting triples from dependency syntax relations☆12May 30, 2017Updated 8 years ago
- Simple Solution for Multi-Criteria Chinese Word Segmentation☆303Aug 12, 2020Updated 5 years ago
- Keyphrase Extraction from Scholarly Documents - Thesis☆14Nov 3, 2021Updated 4 years ago
- 🦜 NLP for Tibetan, in Python.☆39Apr 2, 2026Updated last week
- Chinese Word Segmentation task based on BERT and implemented in Pytorch☆14Aug 14, 2020Updated 5 years ago
- Prototype implementation of an architecture suggested in Robot Dream paper (http://arxiv.org/abs/1603.03007)☆12Jul 3, 2019Updated 6 years ago
- Chinese semantic role labeling based on Bi-LSTM and CRF☆88Jun 4, 2019Updated 6 years ago
- Build and visualize a word2vec model on Sogou news data (SogouCS)☆13Mar 3, 2018Updated 8 years ago
- Deploy an LSTM-based sentiment classification model on TensorFlow Serving☆10Sep 13, 2018Updated 7 years ago
- Word segmentation implemented with CRF++ in Python☆37Jun 19, 2018Updated 7 years ago
- This package includes some extra functions to matplotlib.☆11May 10, 2022Updated 3 years ago
- Text classification benchmark☆25Mar 29, 2018Updated 8 years ago
- Crawler for official policy documents from the general offices of all provinces in China☆22Dec 27, 2020Updated 5 years ago
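- Reuse of Su Jianlin's project, applied to financial event relation extraction☆11Mar 26, 2021Updated 5 years ago
- Implementation of a keyword extraction algorithm based on LDA and TextRank☆23Aug 11, 2017Updated 8 years ago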
- Graph Based Multi-sentences Compression Algorithm.☆31Oct 15, 2017Updated 8 years ago
- Code and pre-trained models for the paper "Segatron: Segment-aware Transformer for Language Modeling and Understanding"☆18Oct 25, 2022Updated 3 years ago
- This project implements a Transformer model in Keras for text classification (both Chinese and English are supported).☆12Mar 31, 2022Updated 4 years ago
- MLiA_SourceCode: Machine Learning in Action, ten classic algorithms☆13Jan 14, 2019Updated 7 years ago
- BMInf demos.☆16Oct 14, 2021Updated 4 years ago