Python package to parse PDFs with different parsers
☆248Sep 12, 2025Updated 5 months ago
Alternatives and similar repositories for ParseStudio
Users who are interested in ParseStudio are comparing it to the libraries listed below.
- An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)☆1,866Aug 25, 2025Updated 6 months ago
- Unlimited text-to-speech in the Browser using Kokoro-JS, 100% local, 100% open source☆328Jun 12, 2025Updated 8 months ago
- Xiaozhi AI robot☆10Mar 8, 2025Updated last year
- The official repository of NodeRAG☆410Mar 19, 2025Updated 11 months ago
- The official repository of "Document Image Machine Translation with Dynamic Multi-pre-trained Models Assembling"☆12Nov 26, 2025Updated 3 months ago
- Named Entity Recognition (NER) and Relation Extraction (RE) library using Regular Expressions☆10Jun 2, 2023Updated 2 years ago
- DB-based Optical Chemical Structure Recognition☆12Sep 12, 2022Updated 3 years ago
- Dedoc is a library (service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical …☆648Updated this week
- A minimal Openclaw built using the Opencode SDK☆48Feb 7, 2026Updated last month
- A Docker-powered service for PDF document layout analysis. This service provides a powerful and flexible PDF analysis service. The servic…☆1,092Mar 2, 2026Updated last week
- A userspace filesystem backed by Apache OpenDAL.☆38Jan 8, 2026Updated 2 months ago
- Nebula docker image for development☆16Nov 7, 2025Updated 4 months ago
- ☆14Apr 28, 2025Updated 10 months ago
- POINTS-Reader train☆20Sep 20, 2025Updated 5 months ago
- Fathom-DeepResearch: Unlocking Long Horizon Information Retrieval And Synthesis For SLMs☆55Oct 7, 2025Updated 5 months ago
- OCRFlux is a lightweight yet powerful multimodal toolkit that significantly advances PDF-to-Markdown conversion, excelling in complex lay…☆2,486Aug 4, 2025Updated 7 months ago
- Model Context Protocol Server for Apache OpenDAL™☆34Apr 10, 2025Updated 10 months ago
- A Deep Research agent from scratch☆216May 18, 2025Updated 9 months ago
- ☆22Apr 8, 2024Updated last year
- Apache OpenDAL Go Binding Services Releases☆15Sep 11, 2025Updated 5 months ago
- Polar is a secure and scalable knowledge graph framework, designed to address the challenges posed by building big data systems in highly…☆21Updated this week
- Automatically generates reports based on Word document styles, in Word and PDF formats☆19Dec 1, 2020Updated 5 years ago
- PDF data for Chinese-language papers, securities documents, and financial reports☆37Jun 13, 2024Updated last year
- A lightweight LMM-based Document Parsing Model☆6,522Updated this week
- Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pret…☆715Feb 3, 2026Updated last month
- An agentic company research tool powered by LangGraph and Tavily that conducts deep diligence on companies using a multi-agent framework.…☆1,604Feb 27, 2026Updated last week
- FlexRAG: A RAG Framework for Information Retrieval and Generation.☆233Jan 21, 2026Updated last month
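- VibeClip Pro is a cross-platform clipboard console for creators and productivity enthusiasts. It integrates local history, AI quick actions, and dark/light visual themes, turning "copy → process → paste" into a flow that takes a single breath.☆43Dec 20, 2025Updated 2 months ago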
- Developer-friendly inference code for RVC-Boss/GPT-SoVITS.☆16Sep 29, 2024Updated last year
- Coze API to OpenAI☆15Sep 1, 2024Updated last year
- Schedule of developer events in China (focus: open source, developers, cloud native)☆23Feb 25, 2026Updated last week
- A Comprehensive Toolkit for High-Quality PDF Content Extraction☆9,433Jan 3, 2025Updated last year
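- agentcp is an Agent SDK based on the ACP protocol that addresses identity authentication and communication between Agents. It is used to create AIDs, connect to the network, build sessions, and send and receive messages, and it supports multi-Agent collaboration, asynchronous message handling, NAT traversal, and load balancing for Agent access.☆31Feb 27, 2026Updated last week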
- transformer tokenizers (e.g. BERT tokenizer) in C++ (WIP)☆18Apr 7, 2022Updated 3 years ago
- ☆25Oct 28, 2024Updated last year
- Made in China!☆23Mar 2, 2026Updated last week
- Test Environment Booking tool☆14Nov 16, 2020Updated 5 years ago
- [EMNLP 2025] ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents☆637Jan 11, 2026Updated last month
- A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines☆5,414Updated this week