CyberCommy / baidu-wiki-500w
Baidu Baike 5 million dataset
☆31 Updated last year
Alternatives and similar repositories for baidu-wiki-500w:
Users that are interested in baidu-wiki-500w are comparing it to the libraries listed below
- PNW, a Chinese new-word discovery algorithm that can identify new words of any length. ☆15 Updated last year
- Baidu QA 1 million dataset ☆48 Updated last year
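- A hand-written implementation of Elasticsearch's inverted index and the BM25 algorithm ☆46 Updated 6 years ago
- Chinese text error correction ☆91 Updated 2 years ago
- A tool for time expression extraction, parsing, and normalization ☆50 Updated 2 years ago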
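- A bot that converts text to vectors, built on sentence-transformers ☆46 Updated 2 years ago
- Domain-specific lexicon construction / Chinese new-word discovery / domain lexicon discovery ☆29 Updated 5 years ago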
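- Fine-tunes pretrained language models (BERT, Roberta, XLBert, etc.) to compute the similarity between two texts (recast as a sentence-pair classification task); suited to Chinese text ☆90 Updated 4 years ago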
- RelExt: A Tool for Relation Extraction from Text (entity relation extraction). ☆48 Updated 2 years ago
- Universal information extraction (entity/relation/event extraction) based on the Qwen2 model ☆29 Updated 6 months ago
- Finetune baichuan pretrained model with QLora method ☆15 Updated last year
- clue chatyuan finetuning ☆17 Updated 8 months ago
- NLP tools: word segmentation, sentence segmentation, New-Word-Discovery ☆25 Updated 11 months ago
- NLP (natural language processing) tutorial https://dataxujing.github.io/NLP-paper/ ☆32 Updated 3 years ago
- moss chat finetuning ☆50 Updated 8 months ago
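- GoGPT: a large model with enhanced Chinese and English capabilities, trained on Llama/Llama 2 | Chinese-Llama2 ☆78 Updated last year
- A pretraining-based tool for generating sentence vectors ☆134 Updated last year
- Datawhale's in-house data annotation tool ☆66 Updated 8 months ago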
- ZhidaoChatbot, a chatbot that can be an expert on the common questions like why, how, when, who, what based on the online question-answer web… ☆42 Updated 5 years ago
- A one-stop, automated, open-source annotation platform ☆66 Updated 2 years ago
- A Python toolkit for file processing, text cleaning and data splitting. ☆29 Updated 2 years ago
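- A retrieval-based dialogue system solution built on vector recall: dense retrieval, FAQ… ☆33 Updated 3 years ago
- Parameter-efficient fine-tuning of ChatGLM-6B with LoRA and P-Tuning v2 ☆54 Updated last year
- Building a character relationship graph for Jin Yong's novels ☆63 Updated 5 years ago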
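- Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model (~14MB). Runs inference on the ppocr-v4-onnx model for accurate, millisecond-level OCR prediction on CPU; Chinese/English OCR in general scenarios reaches open-source SO… ☆52 Updated 3 weeks ago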
- baike schema crawler for baidu baike, hudongbaike. A script for crawling the concept taxonomies of Baidu Baike and Hudong Baike. ☆32 Updated 6 years ago
- CCKS 2022 universal information extraction ☆12 Updated 2 years ago
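- Cleans corpora collected from dbpedia and Baike to produce suitable triples ☆14 Updated 7 years ago
- Medical corpus. Corpus of medical institution names. Drug standard codes. ☆63 Updated 9 months ago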