SkydustZ / AEC-domain-corpora
The code and dataset for the paper "Pretrained Domain-Specific Language Model for General Information Retrieval Tasks in the AEC Domain"
☆24Updated 3 years ago
Alternatives and similar repositories for AEC-domain-corpora
Users that are interested in AEC-domain-corpora are comparing it to the libraries listed below
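- This project uses LayoutLMv3 for training and inference on Chinese images. It mainly addresses three problems: 1. normalizing data into a usable training dataset format; 2. modifying the tokenization of layoutlmv3-base-chinese; 3. splitting text that exceeds a length of 512 and applying a sliding window.☆63Updated last year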
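- VLE: Vision-Language Encoder (a vision-language multimodal pretrained model)☆194Updated 2 years ago
- YAYI information extraction large model: instruction-tuned on millions of manually constructed, high-quality information extraction examples, developed by the 中科闻歌 algorithm team. (Repo for YAYI Unified Information Extraction Model)☆314Updated last year
- Knowledge graph entity and relation extraction for building-code text data, knowledge graph construction, and a retrieval-augmented generation demo☆34Updated last year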
- TechGPT: Technology-Oriented Generative Pretrained Transformer☆228Updated 2 years ago
- Generate dialog data from documents using LLMs such as ChatGLM2 or ChatGPT☆163Updated 2 years ago
- LLM for NER☆80Updated last year
- Chinese document classification with LayoutLMv3 and LayoutXLM☆46Updated 3 years ago
- PyTorch implementation of JointBERT: "BERT for Joint Intent Classification and Slot Filling"☆46Updated 2 years ago
- General layout analysis | Chinese document parsing | Document Layout Analysis | layout parser☆48Updated last year
- Document orientation classification☆224Updated last year
- 360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute☆305Updated last year
- ICDAR 2024 Table OCR Model☆38Updated this week
- This project uses large language models (LLMs) for open-domain triple extraction.☆32Updated 2 years ago
- An open-source and powerful Information Extraction toolkit based on GPT (GPT for Information Extraction; GPT4IE for short). Note: we set a…☆176Updated 2 years ago
- Universal information extraction with instruction learning☆392Updated 10 months ago
- The online version is temporarily unavailable because we cannot afford the key. You can clone and run it locally. Note: we set defaul ope…☆829Updated last year
- Chinese layout detection: YOLOv8 is used to detect the layout of Chinese document images.☆59Updated 2 years ago
- ☆273Updated 2 years ago
- QAonMilitaryKG: a QA system based on a military knowledge graph stored in MongoDB, which differs from the previous one…☆106Updated 6 years ago
- Generate table images via browser rendering☆235Updated last year
- ☆108Updated 4 years ago
- ☆47Updated last year
- We released BERT-wwm, a Chinese pre-training model based on Whole Word Masking technology, and models closely related to this technology.…☆64Updated 2 years ago
- PyTorch implementation of the PaddleNLP UIE model☆676Updated 2 years ago
- Qwen1.5-SFT (Alibaba): fine-tuning (transformers), LoRA (peft), and inference for Qwen_Qwen1.5-2B-Chat/Qwen_Qwen1.5-7B-Chat☆69Updated last year
- CDLA: A Chinese document layout analysis (CDLA) dataset☆287Updated 4 years ago
- [ACL 2024] IEPile: A Large-Scale Information Extraction Corpus☆210Updated last year
- SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding☆226Updated 2 years ago
- ☆135Updated 2 years ago