opendatalab / magic-doc
☆372Updated 4 months ago
Alternatives and similar repositories for magic-doc:
Users that are interested in magic-doc are comparing it to the libraries listed below
- The Open-Source Data Annotation Platform☆591Updated 3 weeks ago
- Data annotation toolbox supports image, audio and video data.☆884Updated last week
- ☆278Updated last week
- WanJuan 1.0 multimodal corpus☆547Updated last year
- A Python wrapper for the Doc2X API that also comes with native text processing (to improve PDF recall in RAG).☆200Updated this week
- Layout analysis for Chinese and English documents☆134Updated last month
- 360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute☆243Updated 2 months ago
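- 📝 Extracts content from document images and reproduces each image one-to-one in Word or TXT for further use or processing. Planned support: PDF/image input with output in JSON, TXT, Word, and Markdown formats.☆155Updated last month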
- Adds a 🚀 streaming web server to GraphRAG: compatible with the OpenAI SDK, with accessible entity links 🔗, suggested questions, local embedding model support, and many bug fixes.☆200Updated 3 weeks ago
- DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception☆542Updated last month
- UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition☆212Updated 2 months ago
- Collects the best currently open-source table recognition models, improves pre- and post-processing, and converts the models to ONNX.☆339Updated this week
- To try TextIn document parsing, visit https://cc.co/16YSIy☆61Updated 3 weeks ago
- E2M converts various file types (doc, docx, epub, html, htm, url, pdf, ppt, pptx, mp3, m4a) into Markdown. It’s easy to install, with ded…☆307Updated 2 months ago
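- Document orientation classification☆204Updated last week
- A RAG (retrieval-augmented generation) system suited to learning, everyday use, and custom extension. Can go online for AI search.☆399Updated 2 months ago
- A Python-native agent framework☆425Updated last week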
- ☆116Updated last month
- Table recognition algorithm derived from PP-Structure; models are converted to ONNX and run on ONNXRuntime. Simple to deploy, with no memory leaks.☆89Updated last week
- Extracts PDF content, based on RapidOCR.☆133Updated 3 months ago
- E2M API, converting everything to markdown (LLM-friendly Format).☆104Updated 4 months ago
- datasets resource☆90Updated 3 months ago
- To try TextIn document parsing, visit https://cc.co/16YSIy☆108Updated last month
- LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA☆424Updated last month
- 🧠 The most comprehensive collection of excellent Qwen prompts in the world. Feel free to contribute you…☆144Updated last week
- An application example of GraphRAG. The project provides a way to replace the OpenAI models and improves GraphRAG's handling of Chinese content by modifying the original prompts and the document-splitting method.☆66Updated last month
- [ACM'MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"☆202Updated this week
- Netease Youdao's open-source embedding and reranker models for RAG products.☆1,505Updated this week
- HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance☆1,532Updated last month
- Converts documents (pdf/html/doc/docx/ppt/pptx) to Markdown☆34Updated 4 months ago