A demo of a PDF parser (including OCR and object detection tools)
☆36Oct 14, 2024Updated last year
Alternatives and similar repositories for PDFparser
Users that are interested in PDFparser are comparing it to the libraries listed below.
- ☆19Jun 21, 2024Updated 2 years ago
- LINE: Large-scale Information Network Embedding in PyTorch☆17Jul 25, 2024Updated last year
- Used in M4C feature extraction script: https://github.com/facebookresearch/mmf/blob/project/m4c/projects/M4C/scripts/extract_ocr_frcn_fea…☆13Jan 30, 2020Updated 6 years ago
- Code for "An Empirical Study of Retrieval Augmented Generation with Chain-of-Thought"☆18Jul 27, 2024Updated last year
- The imdb files with SBD-Trans OCR for TextVQA dataset.☆11Nov 30, 2021Updated 4 years ago
- The implementation for CIKM 2024: Towards Completeness-Oriented Tool Retrieval for Large Language Models.☆26Nov 6, 2024Updated last year
- Repository for ICCV23 paper: "ReFit: Recurrent Fitting Network for 3D Human Recovery"☆99Apr 3, 2024Updated 2 years ago
- Tools for working with the KITTI dataset, including point cloud projection, road segmentation, sparse-to-dense estimation, and lane line detection.☆14Jun 22, 2022Updated 4 years ago
- Python implementation of an AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, w…☆49Mar 22, 2025Updated last year
- Textin xParse web integration - React☆191Feb 24, 2026Updated 4 months ago
- ☆14Feb 3, 2022Updated 4 years ago
- PyTorch implementation of JointBERT: "BERT for Joint Intent Classification and Slot Filling"☆48Sep 5, 2023Updated 2 years ago
- ☆15Nov 22, 2023Updated 2 years ago
- TXT text corpus data cleaning: 1> merge TXT files; 2> filter out noise strings; 3> mask person, place, and organization names; 4> convert other encodings uniformly to UTF-8☆19Oct 14, 2022Updated 3 years ago
- 360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute☆305Sep 10, 2024Updated last year
- ☆11Oct 31, 2024Updated last year
- A collection of recent work on 3D scene representation, focusing on deep-learning-based work, including Neural Radiance Field (NeRF), Signed Distance Function (SDF), Occupancy Field, and 3D Gaussian Splatting. It not only includes…☆13Dec 16, 2023Updated 2 years ago
- Agentica: Lightweight async-first Python framework for AI agents, supporting tool calling, RAG, multi-agent setups, and MCP.☆317Updated this week
- [ACL '25] Source code for our paper ''RankCoT: Refining Knowledge for Retrieval-Augmented Generation through Ranking Chain-of-Thoughts''☆53Nov 27, 2025Updated 7 months ago
- A curated list of resources dedicated to face recognition.☆16Aug 4, 2018Updated 7 years ago
- This is a meta-model distilled from LLMs for information extraction. This is an intermediate checkpoint that can be well-transferred to a…☆30Feb 23, 2025Updated last year
- Chinese corpus cleaning and quality assessment for large-model pre-training☆80Jul 25, 2024Updated last year
- ☆14Jul 13, 2022Updated 3 years ago
- Tracks trending GitHub repos, updated automatically every day☆52Updated this week
- Repo for the paper "AgentRE: An Agent-Based Framework for Navigating Complex Information Landscapes in Relation Extraction".☆70Jul 24, 2024Updated last year
- pinyintokenizer, a pinyin tokenizer that splits continuous pinyin into a list of single-character pinyin syllables.☆31Feb 5, 2025Updated last year
- ☆21Nov 19, 2023Updated 2 years ago
- Compute benchmark of table structure recognition.☆30Dec 2, 2025Updated 6 months ago
- Efficient, Flexible, and Highly Fault-Tolerant Model Service Management Based on SGLang☆62Nov 8, 2024Updated last year
- EMNLP 2024 | Style-Specific Neurons for Steering LLMs in Text Style Transfer☆14Mar 23, 2025Updated last year
- IFTG (ImageFromTextGenerator) is a Python package that simplifies creating robust datasets for OCR models. Generate images from text, app…☆21Nov 7, 2025Updated 7 months ago
- SearchGPT: Building a quick conversation-based search engine with LLMs.☆45Jan 5, 2025Updated last year
- Quickly build an AIGC knowledge-base question answering system, with easy enterprise-level customization and support for document upload, vector storage, and chat-style Q&A.☆36Nov 11, 2023Updated 2 years ago
- A pipeline for the automatic construction of geometry problems along with step-by-step solutions.☆17Aug 27, 2025Updated 10 months ago
- 🔎 A deep-dive into HyDE for Advanced LLM RAG + 💡 Introducing AutoHyDE, a semi-supervised framework to improve the effectiveness, covera…☆37Mar 26, 2024Updated 2 years ago
- LGPMA inference for table structure recognition☆25Nov 17, 2022Updated 3 years ago
- ☆11Mar 28, 2023Updated 3 years ago
- A template for TensorFlow projects to maximize code reuse.☆16Jan 30, 2019Updated 7 years ago
- My implementation of Kosmos2.5 from the paper: "KOSMOS-2.5: A Multimodal Literate Model"☆75Jun 22, 2026Updated last week