WalkerMitty/PDFparser

WalkerMitty / PDFparser

A demo of a PDF parser, including OCR and object detection tools.

☆36

Alternatives and similar repositories for PDFparser

Users who are interested in PDFparser are comparing it to the libraries listed below.

wux-labs / OpenXLab-IntelligentSalesAssistant
☆19 · Jun 21, 2024 · Updated 2 years ago
DunZhang / Jasper-Token-Compression-Training
Training code for Jasper-Token-Compression-600M
☆20 · Nov 19, 2025 · Updated 8 months ago
taolusi / SECURE
ACL'2024-Main: Synergetic Event Understanding: A Collaborative Approach to Cross-Document Event Coreference Resolution with Large Languag…
☆12 · Sep 19, 2025 · Updated 10 months ago
thu-spmi / RAG-CoT
Code for "An Empirical Study of Retrieval Augmented Generation with Chain-of-Thought"
☆18 · Jul 27, 2024 · Updated last year
shibing624 / deep-research
Python implementation of an AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, w…
☆49 · Mar 22, 2025 · Updated last year
zouzhenhong98 / kitti-tools
Tools for working with the KITTI dataset, including point cloud projection, road segmentation, sparse-to-dense estimation, and lane line detection.
☆14 · Jun 22, 2022 · Updated 4 years ago
chenyangMl / JointBERT-zh
PyTorch implementation of JointBERT: "BERT for Joint Intent Classification and Slot Filling"
☆48 · Sep 5, 2023 · Updated 2 years ago
Pillars-Creation / ChatGLM-RLHF-LoRA-RM-PPO
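Adds an RLHF implementation to ChatGLM-6B, with line-by-line explanations of some of the core code. The examples cover short news headline generation and an RLHF implementation for recommendation given a specified context.
☆88 · Aug 16, 2023 · Updated 2 years ago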
iimmortall / QuantLib
☆14 · Feb 3, 2022 · Updated 4 years ago
193746 / VHASR
☆11 · Oct 31, 2024 · Updated last year
OmarSamirz / ImageFromTextGenerator
IFTG (ImageFromTextGenerator) is a Python package that simplifies creating robust datasets for OCR models. Generate images from text, app…
☆21 · Nov 7, 2025 · Updated 8 months ago
Toon-nooT / notebooks
☆17 · Jul 30, 2024 · Updated last year
jiangnanboy / llm_corpus_quality
Cleaning and quality evaluation of Chinese corpora for large-model pre-training
☆80 · Jul 25, 2024 · Updated last year
VimalWill / Vstream
Vstream - Video analytics pipeline with hardware-based acceleration (dev stage)
☆10 · Feb 2, 2024 · Updated 2 years ago
360AILAB-NLP / 360LayoutAnalysis
360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute
☆305 · Sep 10, 2024 · Updated last year
KomeijiForce / MetaIE
This is a meta-model distilled from LLMs for information extraction. This is an intermediate checkpoint that can be well-transferred to a…
☆30 · Feb 23, 2025 · Updated last year
shibing624 / github-hot
Automatically tracks hot GitHub repos, updated daily
☆52 · Updated this week
shibing624 / pinyin-tokenizer
pinyintokenizer, a pinyin tokenizer that splits continuous pinyin into a list of single-character pinyin syllables.
☆31 · Feb 5, 2025 · Updated last year
shibing624 / SearchGPT
SearchGPT: Building a quick conversation-based search engine with LLMs.
☆45 · Jan 5, 2025 · Updated last year
ianhohoho / auto-hyde
🔎 A deep-dive into HyDE for Advanced LLM RAG + 💡 Introducing AutoHyDE, a semi-supervised framework to improve the effectiveness, covera…
☆37 · Mar 26, 2024 · Updated 2 years ago
Mungeryang / colqwen3
The code used to train and run inference with the ColQwen3 model. Welcome to follow and star! ⭐️⭐️⭐️ https://huggingface.co/goodman2001/…
☆15 · Jul 4, 2026 · Updated 2 weeks ago
SWHL / LGPMA_Infer
LGPMA inference for table structure recognition
☆25 · Nov 17, 2022 · Updated 3 years ago
osome-iu / ChatGPT_domain_rating
Code and data for the paper "Large language models can rate news outlet credibility"
☆13 · Aug 10, 2024 · Updated last year
mbossX / rtmp-HK
Pushes an RTMP stream from a Hikvision IP camera to an SRS server
☆11 · Apr 16, 2018 · Updated 8 years ago
EngSalem / HaLo
☆16 · Sep 27, 2023 · Updated 2 years ago
llm-semantic-router / vllm-router
vLLM Router
☆55 · Mar 11, 2024 · Updated 2 years ago
Paul33333 / Agentic_RAG
Local DeepSearch (Advantage: Low Threshold): an implementation of Agentic RAG based on DeepSeek-R1 API and Tavily API
☆17 · Jun 21, 2025 · Updated last year
SCUT-DLVCLab / SCUT-EnsExam
SCUT-EnsExam is a real-world handwritten text erasure dataset for examination paper scenarios, which consists of 545 examination paper im…
☆21 · Updated this week
hbzju / SoLar
Source code for NeurIPS 2022 paper SoLar
☆30 · Dec 20, 2023 · Updated 2 years ago
Gokulraj0906 / GQNN
GQNN is a pioneering Python library designed for research and experimentation with Generalized Quantum Neural Networks (GQNNs). By integr…
☆19 · Jun 23, 2026 · Updated 3 weeks ago
ssu-humane / fake-news-thumbnail
A dataset and CLIP baseline for unrepresentative news thumbnail detection (ACL 2022 workshop)
☆12 · May 26, 2022 · Updated 4 years ago
y2kiah / project-griffin
Project Griffin is a real-time, high-performance multithreaded 3D graphics engine and game project, utilizing C++11, LuaJIT, and OpenGL 4…
☆12 · Dec 6, 2019 · Updated 6 years ago
HydroXai / Enhancing-Safety-in-Large-Language-Models
Precision Knowledge Editing (PKE): A novel method to reduce toxicity in LLMs while preserving performance, with robust evaluations and ha…
☆11 · Nov 26, 2024 · Updated last year
shirley-wu / cross-doc-misinfo-detection
☆22 · Dec 13, 2023 · Updated 2 years ago
INK-USC / FaiRR
FaiRR: Faithful and Robust Deductive Reasoning over Natural Language (ACL 2022)
☆14 · May 19, 2022 · Updated 4 years ago
dolab / httptesting
A human-friendly HTTP testing framework for Go.
☆14 · Mar 12, 2025 · Updated last year
wanbiguizhao / layoutlmv3_zh
Applying layoutlmv3 to Chinese documents
☆21 · May 17, 2023 · Updated 3 years ago
shibing624 / weibo-roast
A sharp-tongued Weibo AI that relentlessly disses Weibo bloggers
☆15 · Jan 2, 2025 · Updated last year
philcn / OrderIndependentTransparency
Cinder port of https://github.com/gangliao/Order-Independent-Transparency-GPU
☆15 · Sep 22, 2018 · Updated 7 years ago