hbh112233abc / pdfplumber
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
☆61 Updated last year
Alternatives and similar repositories for pdfplumber
Users who are interested in pdfplumber are comparing it to the libraries listed below.
- clueai toolkit: build the custom API you need with 3 lines of code in 3 minutes! ☆232 Updated 2 years ago
- 360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute ☆305 Updated last year
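- YAYI information extraction large model: instruction-tuned on a million-scale set of manually constructed, high-quality information extraction data, developed by the Zhongke Wenge (中科闻歌) algorithm team. (Repo for YAYI Unified Information Extraction Model) ☆314 Updated last year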
- [COLING 2022] CSL: A Large-scale Chinese Scientific Literature Dataset ☆653 Updated 2 years ago
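- PDF parsing (text, sections, tables, images, references), plus PDF question answering, summarization, and information extraction built on large models (ChatGLM2-6B, RWKV) + langchain + streamlit ☆214 Updated 2 years ago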
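- pke_zh, Python keyphrase extraction for Chinese (zh). A tool for extracting Chinese keywords or key sentences; implements KeyBert, PositionRank, TopicRank, TextRank, and other algorithms, ready to use out of the box. ☆215 Updated last year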
- kbqa, langchain, large language model, chatgpt ☆83 Updated last year
- Python ROUGE Score Implementation for Chinese Language Task (official rouge score) ☆111 Updated last year
- Uses an LLM plus a sensitive-word lexicon to automatically determine whether text contains sensitive words. ☆136 Updated 2 years ago
- Basic framework for RAG (retrieval-augmented generation) ☆86 Updated 2 years ago
- change pdf to txt ☆68 Updated 2 years ago
- A native Chinese retrieval-augmented generation (RAG) evaluation benchmark ☆123 Updated last year
- Huozi (活字) general-purpose large language model ☆391 Updated last year
- SMP 2023 ChatGLM Financial Large Model Challenge: an introduction to the approach behind a 60-point baseline ☆186 Updated 2 years ago
- Alpaca Chinese Dataset -- a Chinese instruction fine-tuning dataset ☆218 Updated last year
- TechGPT: Technology-Oriented Generative Pretrained Transformer ☆228 Updated 2 years ago
- A document search tool built with sentence transformers and chatglm ☆157 Updated 2 years ago
- Tests of Chinese vector (embedding) quality using open-source embedding models ☆146 Updated 2 years ago
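- Fuzi-Mingcha (夫子•明察) is a Chinese judicial large language model jointly developed by Shandong University, Inspur Cloud, and China University of Political Science and Law. Built on ChatGLM as its base model, it is trained on massive unsupervised Chinese judicial corpora and supervised judicial fine-tuning data. The model supports statute retrieval, case analysis, syllogistic judgment reasoning, and judicial dialogue, and aims to provide users with comprehensive, highly accurate legal consultation and answers… ☆366 Updated 4 months ago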
- A Python Package to Access World-Class Generative Models ☆131 Updated last year
- HanFei-1.0 (韩非), the first legal large language model in China trained with full-parameter training ☆126 Updated 2 years ago
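- Focused on Chinese-language large language models applied to a specific industry or domain, to become an industry model or a company-level or industry-level domain model. ☆126 Updated 9 months ago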
- "Taoli" (桃李): a large language model for international Chinese education ☆189 Updated 2 years ago
- SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding ☆226 Updated 2 years ago
- Analysis of the Chinese cognitive abilities of language models ☆236 Updated 2 years ago
- Cleaning and quality assessment of Chinese corpora for large model pre-training ☆73 Updated last year
- Chinese text similarity calculator ☆166 Updated last year
- PromptCLUE, a zero-shot learning model supporting all Chinese tasks ☆665 Updated 2 years ago
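- Chinese datasets for AI training (continuously updated...) and a map of AI companies. Current datasets: 8,000 catering-industry questions, Baidu Zhidao, the Alpaca Chinese dataset, a computer-science domain dataset, the Vicuna dataset, the RedPajama dataset, a Chinese Wikipedia entries dataset, and a website/forum Q&A dataset ☆62 Updated 2 years ago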
- A Chinese self-instruct dataset built with ChatGPT ☆119 Updated 2 years ago