SkydustZ / AEC-domain-corpora
The code and dataset for the paper "Pretrained Domain-Specific Language Model for General Information Retrieval Tasks in the AEC Domain"
☆22Updated 2 years ago
Alternatives and similar repositories for AEC-domain-corpora
Users who are interested in AEC-domain-corpora are comparing it to the repositories listed below
- Automated rule transformation for automated rule checking☆32Updated 2 years ago
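- YAYI information extraction large model: instruction-tuned on millions of manually constructed, high-quality information extraction examples, developed by the Zhongke Wenge (中科闻歌) algorithm team. (Repo for YAYI Unified Information Extraction Model)☆304Updated 9 months ago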
- ☆130Updated last year
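- Generate dialog datasets from documents using LLMs such as ChatGLM2 or ChatGPT☆157Updated last year
- An open-source NLP auto-labeling tool for the Chinese-speaking world: leave the easy samples to LabelFast.☆71Updated 4 months ago
- Retrieval-augmented generation (RAG) examples based on large language models☆149Updated last month
- Document orientation classification☆219Updated 6 months ago
- Cleaning and quality evaluation of Chinese corpora for large model pre-training☆64Updated 10 months ago
- Entity/relation/event extraction based on GlobalPointer☆147Updated 3 years ago
- This project uses large language models (LLMs) for open-domain triple extraction.☆26Updated last year
- This project trains and runs inference with layoutlmv3 on Chinese images. It mainly addresses three problems: 1. standardizing data into a usable training dataset format; 2. modifying the layoutlmv3-base-chinese tokenization; 3. splitting text longer than 512 and applying a sliding window☆48Updated 9 months ago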
- ☆16Updated 2 years ago
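- Generate table images by rendering them in a browser☆218Updated last year
- Automatically build knowledge graphs with Python, at the scale of millions, tens of millions, or hundreds of millions☆39Updated 2 years ago
- Applying layoutlmv3 to Chinese documents☆18Updated 2 years ago
- Alibaba Tianchi: 2023 Global Intelligent Vehicle AI Challenge, Track 1: retrieval question answering with large AI models, baseline 80+☆105Updated last year
- 360LayoutAnalysis, a series of Document Analysis Models and Datasets developed by 360 AI Research Institute☆283Updated 8 months ago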
- Integrating the ONgDB database into the langchain ecosystem☆77Updated 2 years ago
- LLM for NER☆73Updated 10 months ago
- Universal information extraction with instruction learning☆387Updated 3 months ago
- kbqa, langchain, large language model, chatgpt☆80Updated 7 months ago
- We released BERT-wwm, a Chinese pre-training model based on Whole Word Masking technology, and models closely related to this technology.…☆61Updated 2 years ago
- Named entity recognition with Baidu UIE, based on pytorch.☆56Updated 2 years ago
- An open-source and powerful Information Extraction toolkit based on GPT (GPT for Information Extraction; GPT4IE for short). Note: we set a…☆174Updated 2 years ago
- General layout analysis | Chinese document parsing | Document Layout Analysis | layout parser☆46Updated 11 months ago
- Information extraction with pointer networks, including named entity recognition, relation extraction, and event extraction.☆126Updated 2 years ago
- Chinese layout detection: yolov8 is used to detect the layout of Chinese document images.☆59Updated 2 years ago
- Basic architecture of graphrag☆35Updated 7 months ago
- QAonMilitaryKG, a QA system based on a military knowledge graph stored in mongodb, which is different from the previous one; a mongodb-backed military…☆95Updated 6 years ago
- AdaSeq: An All-in-One Library for Developing State-of-the-Art Sequence Understanding Models☆439Updated last year