jayhenry / pdf2txt_mnbvc
☆42Updated 2 years ago
Alternatives and similar repositories for pdf2txt_mnbvc
Users that are interested in pdf2txt_mnbvc are comparing it to the libraries listed below
- HanFei-1.0 (韩非): the first legal large language model in China trained with full-parameter training☆124Updated last year
- SearchGPT: Building a quick conversation-based search engine with LLMs.☆46Updated 9 months ago
- ChatGPT WebUI using gradio. A simple, easy-to-use web UI for LLM chat and retrieval-augmented knowledge Q&A (RAG)☆137Updated last year
- A Multi-Modal Dataset of Chinese Governmental Documents☆37Updated 4 years ago
- A Chinese-native benchmark for evaluating retrieval-augmented generation☆123Updated last year
- Analysis of the Chinese-language cognitive abilities of language models☆236Updated 2 years ago
- TianGong-AI-Unstructure☆69Updated last week
- An open-source NLP auto-labeling tool for the Chinese-speaking world; leave the simple samples to LabelFast.☆77Updated 9 months ago
- XVERSE-65B: A multilingual large language model developed by XVERSE Technology Inc.☆140Updated last year
- TechGPT 2.0: Technology-Oriented Generative Pretrained Transformer 2.0☆114Updated last year
- Text deduplication☆76Updated last year
- Uses an LLM plus a sensitive-word lexicon to automatically determine whether text contains sensitive words.☆131Updated 2 years ago
- ☆43Updated 2 years ago
- ☆67Updated last year
- Fine-tune the BLOOM large language model with the LoRA method☆32Updated 2 years ago
- A large-scale language model for the scientific domain, trained on the RedPajama arXiv split☆136Updated last year
- Collection of Chinese Books☆200Updated last year
- Silk Road will be the dataset zoo for Luotuo (骆驼). Luotuo is an open-source Chinese LLM project founded by 陈启源 @ Central China Normal University & 李鲁鲁 @ SenseTime & 冷子…☆40Updated last year
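- Focuses on Chinese-language large language models applied to a specific industry or domain, producing industry models and company-level or industry-level domain models.☆123Updated 7 months ago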
- [ACL 2024] IEPile: A Large-Scale Information Extraction Corpus☆204Updated 9 months ago
- 360LayoutAnalysis, a series of Document Analysis Models and Datasets developed by 360 AI Research Institute☆302Updated last year
- "桃李" (Taoli): a large language model for international Chinese-language education☆183Updated last year
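- Legal-Eagle-InternLM is a legal question-answering bot built on InternLM (书生浦语), the large language model released by SenseTime and Shanghai AI Laboratory. It is a legal-domain model that aims to provide professional, intelligent, and comprehensive legal services in line with the 3H principles (Helpful, Honest, Harmless).☆61Updated last year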
- MNBVC project: ShareGPT corpus cleaning☆15Updated 2 years ago
- Imitate OpenAI with Local Models☆88Updated last year
- A document search tool built on sentence transformers and chatglm☆157Updated 2 years ago
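- This project performs fast encoding detection and conversion on large numbers of text files to support data cleaning for the mnbvc corpus project☆65Updated this week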
- ☆163Updated 2 years ago
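- YAYI (雅意) information extraction large model: instruction-tuned on million-scale, manually constructed, high-quality information extraction data, developed by the 中科闻歌 algorithm team. (Repo for YAYI Unified Information Extraction Model)☆312Updated last year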
- Generate dialog datasets from documents using LLMs such as ChatGLM2 or ChatGPT☆159Updated last year