RuilinXu / GovDoc-CN
A Multi-Modal Dataset of Chinese Governmental Documents
☆38Updated 4 years ago
Alternatives and similar repositories for GovDoc-CN
Users who are interested in GovDoc-CN are comparing it to the repositories listed below
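- "桃李" (Taoli): a large language model for international Chinese-language education☆185Updated last year
- Universal information extraction (entity/relation/event extraction) based on the Qwen2 model☆37Updated last year
- An open-source NLP auto-annotation tool for the Chinese-speaking world; leave the easy samples to LabelFast.☆80Updated 9 months ago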
- SearchGPT: Building a quick conversation-based search engine with LLMs.☆46Updated 9 months ago
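- YAYI (雅意) information extraction LLM: instruction-tuned on millions of manually constructed, high-quality information extraction samples; developed by the 中科闻歌 algorithm team. (Repo for YAYI Unified Information Extraction Model)☆312Updated last year
- Hands-on information extraction with LLaMA☆100Updated 2 years ago
- SMP 2023 ChatGLM Financial LLM Challenge: walkthrough of a 60-point baseline approach☆186Updated 2 years ago
- pke_zh, Python keyphrase extraction for Chinese (zh). A tool for extracting Chinese keywords or key sentences; implements KeyBert, PositionRank, TopicRank, TextRank, and other algorithms; works out of the box.☆207Updated last year
- A native-Chinese benchmark for evaluating retrieval-augmented generation☆123Updated last year
- Uses an LLM plus a sensitive-word lexicon to automatically detect whether text contains sensitive words.☆132Updated 2 years ago
- Chinese text correction☆93Updated 3 years ago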
- OpenTextClassification is all you need for text classification! Open text classification for everyone, enjoy your NLP journey! This may be the most comprehensive so far…☆208Updated last year
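- PDF parsing (text, sections, tables, images, references); PDF question answering, summarization, and information extraction based on LLMs (ChatGLM2-6B, RWKV) + langchain + streamlit☆211Updated 2 years ago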
- LLM for NER☆81Updated last year
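- A document search tool built on sentence transformers and chatglm☆157Updated 2 years ago
- 骆驼QA (Luotuo QA), a Chinese large language model for reading comprehension.☆75Updated 2 years ago
- CNKI (中国知网) paper dataset with information on 24,000+ papers. Natural language processing, information management, text classification, text summarization, keyword extraction, research hotspot analysis, data mining, data analysis☆53Updated 7 months ago
- 🌈 NERpy: Implementation of Named Entity Recognition using Python. A named entity recognition tool supporting BertSoftmax, BertSpan, and other models; works out of the box.☆116Updated last year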
- 360LayoutAnalysis, a series of Document Analysis Models and Datasets developed by 360 AI Research Institute☆303Updated last year
- ☆67Updated last year
- General layout analysis | Chinese document parsing | Document Layout Analysis | layout parser☆47Updated last year
- ☆23Updated 2 years ago
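- ChatGLM2-6B fine-tuning, SFT/LoRA, instruction finetune☆110Updated 2 years ago
- HanFei-1.0 (韩非), China's first legal LLM trained with full-parameter training☆124Updated 2 years ago
- A cleaned-up Tianchi competition entry. Extracts 18 fields from PDFs: name, date of birth, gender, phone number, highest education level, place of origin, city/county of household registration, political affiliation, graduating institution, employer, job description, job title, project name, project responsibilities, degree, graduation date, employment period, and project period.☆116Updated last year
- Instruction fine-tuning of the baichuan-7B LLM using qlora.☆23Updated 2 years ago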
- deep training task☆30Updated 2 years ago
- ☆194Updated 8 months ago
- TechGPT: Technology-Oriented Generative Pretrained Transformer☆226Updated 2 years ago
- ☆22Updated 3 years ago