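PDF parsing (text, sections, tables, images, references); PDF question answering, summarization, and information extraction based on large models (ChatGLM2-6B, RWKV) + langchain + streamlit
☆210 · Oct 17, 2023 · Updated 2 years ago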
Alternatives and similar repositories for pdf_parsing
Users that are interested in pdf_parsing are comparing it to the libraries listed below.
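- ChatPDF: PDF parsing and reading based on LangChain and LLMs (ChatGLM, GPT...) ☆55 · Jun 5, 2024 · Updated last year
- DB-based Optical Chemical Structure Recognition ☆12 · Sep 12, 2022 · Updated 3 years ago
- This project collects open-source datasets for table intelligence tasks (such as table question answering and table-to-text generation), converts the raw data into instruction-tuning format, and fine-tunes LLMs on it to strengthen their understanding of tabular data, with the goal of building a large language model specialized for table intelligence tasks. ☆643 · Apr 22, 2024 · Updated 2 years ago
- Creating a graph that summarizes correlations between stocks and using a Graph Neural Network to encode that information to be utilized i… ☆18 · May 19, 2023 · Updated 2 years ago
- In RAG, generating and matching embedding vectors is a key step. This article introduces an embedding service based on CLIP/BLIP models that supports embedding generation and similarity computation for text and images, providing a foundation for multimodal information retrieval. ☆42 · Dec 28, 2024 · Updated last year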
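- BAAI/bge-large-zh could not be cloned from Hugging Face, so it was downloaded manually and is hosted here for ease of use ☆11 · Sep 16, 2023 · Updated 2 years ago
- Extract PDF content based on RapidOCR ☆188 · Mar 6, 2026 · Updated last month
- A question-answering knowledge base built from PDF documents, implemented with LangChain ☆39 · Apr 12, 2024 · Updated 2 years ago
- Document orientation classification ☆221 · Feb 3, 2026 · Updated 2 months ago
- ☆33 · Jan 17, 2025 · Updated last year
- A hydraulic surrogate model and real-time control methods of urban drainage networks. ☆38 · Jan 7, 2026 · Updated 3 months ago
- Chinese CLIP: supports custom datasets and extracts vectors from text and images for text-image matching. ☆22 · Sep 14, 2022 · Updated 3 years ago
- Uses the large language model ChatGLM-6B as the base and adds document reading for real-time question answering; supports uploading txt/docx/pdf and other file types. ☆42 · Sep 11, 2023 · Updated 2 years ago
- Multi-Label Text Classification Based On Bert ☆22 · Feb 28, 2023 · Updated 3 years ago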
- A simple, easy-to-hack GraphRAG implementation ☆15 · Sep 21, 2024 · Updated last year
- Sync WeChat Reading (微信读书) highlights and notes to Readwise ☆14 · Jun 1, 2023 · Updated 2 years ago
- GitHub repo for Peifeng's internship project ☆13 · Nov 7, 2023 · Updated 2 years ago
- Official repository for "Unveiling Opinion Evolution via Prompting and Diffusion for Short Video Fake News Detection", ACL Findings 2024. ☆15 · Apr 25, 2025 · Updated last year
- An agent built on the InternLM chat 7B base model that can call MMYOLO tools to perform visual tasks on images ☆11 · Oct 30, 2024 · Updated last year
- FinGLM: dedicated to building an open, public-interest, long-lasting financial large model project, using open source and openness to advance "AI + finance". ☆2,223 · May 8, 2024 · Updated last year
- Repository for the Zhipu AI 2024 Financial Industry Large Model Challenge ☆60 · Feb 19, 2025 · Updated last year
- Universal information extraction with instruction learning ☆399 · Feb 28, 2025 · Updated last year
- Knowledge graph infrastructure ☆11 · Jul 25, 2022 · Updated 3 years ago
- Converted the Jina Tokenizer regex pattern to Python. ☆26 · Aug 26, 2024 · Updated last year
- A simple implementation of multi-label text classification with Bert. I will extend the code to a higher version for very long text over 512,… ☆12 · Jun 2, 2021 · Updated 4 years ago
- Deploys YOLO11 table detection with OpenCV. This is the 2nd-place solution in the table detection track of the Baidu Netdisk AI Competition, comprising three modules: table box detection, table corner detection, and table orientation classification. As before, I wrote both C++ and Python versions of the program. ☆13 · Dec 12, 2024 · Updated last year
- Examples of retrieval-augmented generation (RAG) based on large language models ☆174 · May 4, 2025 · Updated 11 months ago
- RAG for Local LLM, chat with PDF/doc/txt files, ChatPDF. A pure native RAG implementation based on a local LLM, embedding model, and reranker model; supports GraphRAG, with no need to install any third-party agent library. ☆847 · Apr 2, 2025 · Updated last year
- The online version is temporarily unavailable because we cannot afford the key. You can clone and run it locally. Note: we set default ope… ☆828 · May 28, 2024 · Updated last year
- Baseline method for CROCS 2024 ☆10 · Jan 24, 2024 · Updated 2 years ago
- [EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction ☆4,376 · Jul 19, 2025 · Updated 9 months ago
- Question and Answer based on Anything. ☆13,959 · Mar 24, 2025 · Updated last year
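- Papers related to information extraction. ☆78 · Apr 13, 2023 · Updated 3 years ago
- 360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute ☆308 · Sep 10, 2024 · Updated last year
- DPO training for Tongyi Qianwen (Qwen) ☆65 · Sep 21, 2024 · Updated last year
- An OCR web service built on top of cnstd and cnocr ☆10 · Nov 21, 2021 · Updated 4 years ago
- EARAM for fake news detection ☆14 · Dec 30, 2025 · Updated 4 months ago
- Viscacha: a collection of universal information extraction datasets ☆27 · Feb 21, 2024 · Updated 2 years ago
- An agent project built with LangGraph that uses a large model to automatically call tools and plan travel itineraries, including attraction search, transportation lookup, and restaurant and hotel lookup ☆41 · Aug 27, 2024 · Updated last year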