Python package to parse PDFs with different parsers
☆249Sep 12, 2025Updated 7 months ago
Alternatives and similar repositories for ParseStudio
Users who are interested in ParseStudio are comparing it to the libraries listed below.
- An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)☆1,960Mar 17, 2026Updated last month
- Unlimited text-to-speech in the Browser using Kokoro-JS, 100% local, 100% open source☆334Jun 12, 2025Updated 10 months ago
- Fast, zero-copy HTML Parser written in Rust☆27Dec 6, 2025Updated 4 months ago
- A Docker-powered service for PDF document layout analysis. This service provides a powerful and flexible PDF analysis service. The servic…☆1,108Updated this week
- The minimal, ad-hoc way of plug and play NebulaGraph with pip install, even inside Colab Notebook!☆21May 24, 2024Updated last year
- Parallel and LAzY Analyzer for PDFs 🏖️☆41Mar 9, 2026Updated last month
- A tutorial on DSPy and whether automated prompt engineering lives up to the hype☆26May 3, 2024Updated last year
- The official repository of NodeRAG☆412Mar 19, 2025Updated last year
- ☆30May 9, 2025Updated 11 months ago
- A Deep Research agent from scratch☆219May 18, 2025Updated 11 months ago
- The Level-Navi Agent, a framework that requires no training and utilizes large language models for deep query understanding and precise s…☆82Dec 27, 2024Updated last year
- A set of tools to create synthetically-generated data from documents☆46Aug 15, 2025Updated 8 months ago
- Docling workshops☆42Updated this week
- Dedoc is a library (service) for automating document parsing and converting documents to a uniform format. It automatically extracts content, logical …☆658Apr 7, 2026Updated last week
- Docker version of the new React Chinese documentation, for reading the docs from a local deployment☆13Updated this week
- OCRFlux is a lightweight yet powerful multimodal toolkit that significantly advances PDF-to-Markdown conversion, excelling in complex lay…☆2,491Updated this week
- A lightweight LMM-based Document Parsing Model☆6,581Apr 1, 2026Updated 2 weeks ago
- Film and TV Storyboard Master☆46Nov 27, 2025Updated 4 months ago
- (WIP) various language support for libpglite native☆22Aug 5, 2025Updated 8 months ago
- A Comprehensive Toolkit for High-Quality PDF Content Extraction☆9,579Jan 3, 2025Updated last year
- A simple web-based Docker container management interface with a modern design. This application provides a fast and intuitive way to star…☆113Mar 17, 2026Updated last month
- ☆22Feb 1, 2025Updated last year
- Customize your arXiv recommendation every day.☆147Sep 24, 2025Updated 6 months ago
- Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pret…☆721Mar 6, 2026Updated last month
- An agentic company research tool powered by LangGraph and Tavily that conducts deep diligence on companies using a multi-agent framework.…☆1,672Updated this week
- Create fast graph language models from converted PDF documents for knowledge extraction and Q&A.☆58Jan 27, 2025Updated last year
- Nebula docker image for development☆16Apr 1, 2026Updated 2 weeks ago
- ☆84Mar 6, 2026Updated last month
- The official repository of "Document Image Machine Translation with Dynamic Multi-pre-trained Models Assembling"☆14Nov 26, 2025Updated 4 months ago
- FlexRAG: A RAG Framework for Information Retrieval and Generation.☆236Mar 27, 2026Updated 3 weeks ago
- A userspace filesystem backed by Apache OpenDAL.☆36Jan 8, 2026Updated 3 months ago
- Simple package to extract text with coordinates from programmatic PDFs☆264Updated this week
- Fetch arxiv data to LLM-friendly text☆131Feb 18, 2026Updated 2 months ago
- Test Environment Booking tool☆14Nov 16, 2020Updated 5 years ago
- [EMNLP 2025] ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents☆656Jan 11, 2026Updated 3 months ago
- Model Context Protocol Server for Apache OpenDAL™☆34Apr 10, 2025Updated last year
- A minimal Openclaw built using the Opencode SDK☆80Feb 7, 2026Updated 2 months ago
- Coze API to OpenAI☆15Sep 1, 2024Updated last year
- Xiaozhi AI robot☆10Mar 8, 2025Updated last year