magicpdf / Magic-Doc
Converts documents (pdf/html/doc/docx/ppt/pptx) to Markdown
☆37Updated 6 months ago
Alternatives and similar repositories for Magic-Doc:
Users who are interested in Magic-Doc compare it to the libraries listed below:
- 360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute☆262Updated 5 months ago
- MPB (Miner-PDF-Benchmark) is an end-to-end PDF document comprehension evaluation suite designed for large-scale model data scenarios.☆21Updated 2 months ago
- To try textin document parsing, visit https://cc.co/16YSIy☆73Updated 3 months ago
- Imitate OpenAI with Local Models☆86Updated 5 months ago
- TianGong-AI-Unstructure☆58Updated 3 weeks ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆129Updated 2 months ago
- ☆62Updated 5 months ago
- Analysis of Chinese and English layouts☆166Updated last month
- ☆25Updated 4 months ago
- ☆225Updated 9 months ago
- Official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding"☆139Updated 8 months ago
- Mixture-of-Experts (MoE) Language Model☆184Updated 5 months ago
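- Chinese-native retrieval-augmented generation (RAG) evaluation benchmark☆109Updated 10 months ago
- The newest version of llama3, with source code explained line by line in Chinese☆22Updated 10 months ago
- vLLM-accelerated implementation of GOT, combined with MinerU for PDF parsing in RAG☆46Updated 3 months ago
- Text deduplication☆68Updated 8 months ago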
- ☆168Updated 2 months ago
- General layout analysis | Chinese document parsing | Document Layout Analysis | layout parser☆46Updated 8 months ago
- Agentica: Effortlessly Build Intelligent, Reflective, and Collaborative Multimodal AI Agents!☆129Updated last month
- [ICLR 2025] The official implementation of paper "ToolGen: Unified Tool Retrieval and Calling via Generation"☆125Updated 2 months ago
- [ACM'MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"☆215Updated this week
- Dingo: A Comprehensive Data Quality Evaluation Tool☆36Updated this week
- ☆105Updated last year
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order.☆167Updated 8 months ago
- Code for the piccolo embedding model from SenseTime☆119Updated 9 months ago
- [NAACL 2024] Visually Guided Generative Text-Layout Pre-training for Document Intelligence☆138Updated 5 months ago
- A demo built on Megrez-3B-Instruct, integrating a web search tool to enhance the model's question-and-answer capabilities.☆35Updated 2 months ago
- XVERSE-65B: A multilingual large language model developed by XVERSE Technology Inc.☆135Updated 10 months ago
- SearchGPT: Building a quick conversation-based search engine with LLMs.☆44Updated last month