WalkerMitty / PDFparser
A demo of a PDF parser (including OCR and object detection tools)
☆35Updated 10 months ago
Alternatives and similar repositories for PDFparser
Users who are interested in PDFparser are comparing it to the libraries listed below
- ☆28Updated 10 months ago
- Chinese-native retrieval-augmented generation evaluation benchmark☆121Updated last year
- General-purpose layout analysis | Chinese document parsing | Document Layout Analysis | layout parser☆47Updated last year
- TianGong-AI-Unstructure☆69Updated 2 months ago
- ☆37Updated 4 months ago
- YiZhao: A 2TB Open Financial Corpus. Data and tools for generating and inspecting YiZhao, a safe, high-quality, open-source bilingual fin…☆30Updated last month
- A Toolkit for Table-based Question Answering☆113Updated last year
- Repo for the paper "AgentRE: An Agent-Based Framework for Navigating Complex Information Landscapes in Relation Extraction".☆70Updated last year
- A simple MLLM that surpasses QwenVL-Max using only open-source data, built on a 14B LLM.☆37Updated 11 months ago
- The newest version of llama3, with source code explained line by line in Chinese☆22Updated last year
- To try textin document parsing, visit https://cc.co/16YSIy☆118Updated 2 months ago
- Uses OCR recognition results to constrain the answers of multimodal large models in visual information extraction tasks☆39Updated 7 months ago
- ☆57Updated last year
- An open-source multimodal large language model based on baichuan-7b☆72Updated last year
- Makes RAG pipelines work better on table data☆99Updated 10 months ago
- Recursive Abstractive Processing for Tree-Organized Retrieval☆10Updated last year
- 1st Solution For Conversational Multi-Doc QA Workshop & International Challenge @ WSDM'24 - Xiaohongshu.Inc☆161Updated last month
- This project uses the LLaVA 1.6 multimodal model to implement text-to-image and image-to-image search.☆25Updated last year
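- PDF parsing tool: a vLLM-accelerated implementation of GOT, with MinerU for layout detection and cropping and GOT for table and formula parsing, providing PDF parsing for RAG☆62Updated 9 months ago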
- ☆19Updated last year
- Generate dialog datasets from documents using LLMs such as ChatGLM2 or ChatGPT☆159Updated last year
- Train a Chinese mini large language model from scratch that can hold basic conversations; the model size depends on the hardware available☆61Updated last year
- SearchGPT: Building a quick conversation-based search engine with LLMs.☆47Updated 7 months ago
- Python implementation of AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, w…☆47Updated 5 months ago
- Qwen-WisdomVast is a large model trained on 1 million high-quality Chinese multi-turn SFT data, 200,000 English multi-turn SFT data, and …☆18Updated last year
- ☆15Updated last year
- LLM+RAG for QA☆23Updated last year
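- Built on the ChatGLM-6B large language model, with document reading added for real-time question answering; supports uploading txt, docx, pdf, and other file types.☆41Updated last year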
- PDF data from Chinese papers, securities documents, and financial reports☆34Updated last year
- [ACL24] Official repo for "Synthesizing Text-to-SQL Data from Weak and Strong LLMs"☆67Updated last year