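PDF parsing (text, sections, tables, images, references); PDF question answering, summarization, and information extraction based on large language models (ChatGLM2-6B, RWKV) + langchain + streamlit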
☆210Oct 17, 2023Updated 2 years ago
Alternatives and similar repositories for pdf_parsing
Users that are interested in pdf_parsing are comparing it to the libraries listed below.
- ChatPDF: PDF parsing and reading based on LangChain and LLMs (ChatGLM, GPT...)☆55Jun 5, 2024Updated last year
- Study notes on the survey book "Large Language Models" (《大语言模型》)☆12Aug 2, 2024Updated last year
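- This project collects open-source datasets for table intelligence tasks (such as table question answering and table-to-text generation), converts the raw data into instruction-tuning format, and fine-tunes LLMs on it to improve their understanding of tabular data, with the goal of building a large language model specialized for table intelligence tasks.☆643Apr 22, 2024Updated 2 years ago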
- Baidu search API. Get Baidu search results. Sibling repository of MagicGoogle.☆29Aug 7, 2018Updated 7 years ago
- Creating a graph that summarizes correlations between stocks and using a Graph Neural Network to encode that information to be utilized i…☆18May 19, 2023Updated 3 years ago
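- Embedding generation and matching are key steps in RAG. This article describes an embedding service based on CLIP/BLIP models that supports embedding generation and similarity computation for text and images, providing a foundation for multimodal information retrieval.☆42Dec 28, 2024Updated last year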
- BAAI/bge-large-zh could not be cloned from Hugging Face, so it was downloaded manually and is provided here for convenience.☆11Sep 16, 2023Updated 2 years ago
- Hands-on information extraction with llama☆101Apr 29, 2023Updated 3 years ago
- Based on RapidOCR, extract the PDF content☆188Mar 6, 2026Updated 2 months ago
- A question-answering knowledge base built from PDF documents, implemented with LangChain☆39Apr 12, 2024Updated 2 years ago
- Chinese CLIP: supports custom datasets and extracts vectors from text and images for text-image matching.☆22Sep 14, 2022Updated 3 years ago
- Uses the ChatGLM-6B large language model as a base and adds document reading for real-time question answering; txt, docx, and pdf files can be uploaded.☆42Sep 11, 2023Updated 2 years ago
- Multi-Label Text Classification Based On Bert☆21Feb 28, 2023Updated 3 years ago
- A simple, easy-to-hack GraphRAG implementation☆15Sep 21, 2024Updated last year
- A Qwen-VL model that can be successfully fine-tuned with LoRA☆16Oct 27, 2023Updated 2 years ago
- Github repo for Peifeng's internship project☆13Nov 7, 2023Updated 2 years ago
- Code for DBellQuant☆34Jan 30, 2026Updated 3 months ago
- Builds an agent on the InternLM chat 7B base model that can call MMYOLO tools to perform visual tasks on images☆11Oct 30, 2024Updated last year
- FinGLM: an open, non-profit, long-lived financial large-model project that uses open source to advance "AI + finance".☆2,230May 8, 2024Updated 2 years ago
- Repository for the Zhipu AI 2024 Financial Industry Large Model Challenge☆60Feb 19, 2025Updated last year
- Universal information extraction with instruction learning☆400Feb 28, 2025Updated last year
- Knowledge graph infrastructure☆11Jul 25, 2022Updated 3 years ago
- Converted the Jina Tokenizer regex pattern to python.☆26Aug 26, 2024Updated last year
- A simple implementation of multi-label text classification with BERT. I will extend the code to a higher version for very long text over 512,…☆12Jun 2, 2021Updated 4 years ago
- Improving langchain knowledge graphs using baml☆44Aug 3, 2025Updated 9 months ago
- Built on index-tts-vllm; implements and provides an interface service for simulated streaming audio synthesis, along with client test scripts☆26Sep 2, 2025Updated 8 months ago
- Conversational agents for engineering simulations with minimal human input using Microsoft AutoGen & GPT-4o.☆42Aug 4, 2024Updated last year
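- Deploys YOLO11 table detection with OpenCV. This is the second-place solution in the Baidu Netdisk AI Competition table detection track and comprises three modules: table bounding-box detection, table corner detection, and table orientation classification. As usual, I wrote both C++ and Python versions.☆13Dec 12, 2024Updated last year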
- An example of retrieval-augmented generation (RAG) based on large language models☆174May 4, 2025Updated last year
- RAG for Local LLM, chat with PDF/doc/txt files, ChatPDF. A pure native RAG implementation based on a local LLM, an embedding model, and a reranker model; supports GraphRAG and requires no third-party agent libraries.☆851Apr 2, 2025Updated last year
- The online version is temporarily unavailable because we cannot afford the key. You can clone and run it locally. Note: we set default ope…☆828May 28, 2024Updated last year
- baseline method for CROCS 2024☆10Jan 24, 2024Updated 2 years ago
- [EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction☆4,399Jul 19, 2025Updated 10 months ago
- Question and Answer based on Anything.☆13,990Mar 24, 2025Updated last year
- Table detection and table structure recognition☆24Dec 5, 2022Updated 3 years ago
- 360LayoutAnaylsis, a series of document analysis models and datasets developed by 360 AI Research Institute☆307Sep 10, 2024Updated last year
- DPO training for Tongyi Qianwen (Qwen)☆65Sep 21, 2024Updated last year
- 🔐Free GPT-3.5 chat with your docs (PDF, WORD, CSV, TXT)☆254Nov 13, 2023Updated 2 years ago
- This is an unofficial implementation of the EMNLP 2023 paper: Reading Order Matters: Information Extraction from Visually-rich Documents …☆16May 29, 2024Updated last year