hbh112233abc / pdfplumber
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
☆62Updated last year
Alternatives and similar repositories for pdfplumber
Users that are interested in pdfplumber are comparing it to the libraries listed below
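- 360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute☆305Updated last year
- clueai toolkit: build the custom API you need in 3 lines of code and 3 minutes!☆232Updated 2 years ago
- PDF parsing (text, sections, tables, images, references); PDF question answering, summarization, and information extraction built on large models (ChatGLM2-6B, RWKV) + langchain + streamlit☆214Updated 2 years ago
- YAYI information extraction large model: instruction-tuned on millions of manually constructed, high-quality information extraction examples, developed by the Zhongke Wenge (中科闻歌) algorithm team. (Repo for YAYI Unified Information Extraction Model)☆313Updated last year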
- TechGPT: Technology-Oriented Generative Pretrained Transformer☆228Updated 2 years ago
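- A document search tool built on sentence transformers and chatglm☆157Updated 2 years ago
- pke_zh, Python keyphrase extraction for Chinese (zh). A Chinese keyword and key-sentence extraction tool that implements KeyBert, PositionRank, TopicRank, TextRank, and other algorithms, ready to use out of the box.☆216Updated last year
- A native-Chinese benchmark for evaluating retrieval-augmented generation☆124Updated last year
- A hand-written wrapper for the Baidu search interface, installable with pip and usable from the command line. Baidu Search unofficial API for Python with no external dependencies☆157Updated last year
- The Fuzi-Mingcha (夫子•明察) judicial large model was jointly developed by Shandong University, Inspur Cloud, and China University of Political Science and Law. It uses ChatGLM as its base model and is trained on massive unsupervised Chinese judicial corpora and supervised judicial fine-tuning data. The model supports statute retrieval, case analysis, syllogistic reasoning for judgments, and judicial dialogue, aiming to provide users with comprehensive, highly accurate legal consultation and answers…☆366Updated 6 months ago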
- Convert PDF to TXT☆68Updated 2 years ago
- ☆195Updated 11 months ago
- Uses an LLM plus a sensitive-word lexicon to automatically detect whether text contains sensitive words.☆136Updated 2 years ago
- ☆44Updated 2 years ago
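- Implements Baichuan-Chat fine-tuning with LoRA, QLoRA, and other fine-tuning methods; runs with one click.☆71Updated 2 years ago
- HanFei-1.0 (韩非), the first legal large model in China trained with full-parameter training☆126Updated 2 years ago
- Huozi (活字) general-purpose large model☆391Updated last year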
- [COLING 2022] CSL: A Large-scale Chinese Scientific Literature Dataset☆661Updated 2 years ago
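- Analysis of language models' Chinese cognitive abilities☆236Updated 2 years ago
- Making NLP something everyone can do. Open source isn't easy, so remember to star!☆101Updated 2 years ago
- SMP 2023 ChatGLM Financial Large Model Challenge: walkthrough of a baseline approach scoring 60 points☆186Updated 2 years ago
- ChatGPT WebUI using gradio. A simple, easy-to-use web UI for LLM chat and retrieval-based knowledge question answering (RAG)☆139Updated last year
- Chinese text similarity calculator☆169Updated last year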
- Extracts PDF content using RapidOCR☆184Updated 8 months ago
- Basic framework for RAG (retrieval-augmented generation)☆86Updated 2 years ago
- kbqa, langchain, large language model, chatgpt☆81Updated last year
- ☆67Updated last year
- Instruction-tuning tool for large language models (supports FlashAttention)☆177Updated 2 years ago
- A Python Package to Access World-Class Generative Models☆131Updated last year
- Hands-on information extraction with llama☆102Updated 2 years ago