E2M converts various file types (doc, docx, epub, html, htm, url, pdf, ppt, pptx, mp3, m4a) into Markdown. It is easy to install, provides dedicated parsers and converters, and supports custom configurations. E2M offers an all-in-one, flexible, open-source solution.
☆1,274 · Sep 8, 2024 · Updated last year
Alternatives and similar repositories for e2m
Users who are interested in e2m compare it to the libraries listed below.
- E2M API, converting everything to markdown (LLM-friendly Format). ☆139 · Dec 12, 2024 · Updated last year
- Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents… ☆2,987 · Dec 8, 2025 · Updated 2 months ago
- AI reads books: Page-by-Page PDF Knowledge Extractor & Summarizer. script performs an intelligent page-by-page analysis of PDF books, met… ☆1,577 · Jan 20, 2025 · Updated last year
- Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks ☆6,804 · Dec 12, 2025 · Updated 2 months ago
- Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows. ☆55,275 · Updated this week
- The first open-source agent skills builder. Define skills by vibe workflow, run on Claude Code, Cursor, Codex & more. Build Clawdbot 🦞· … ☆6,845 · Feb 28, 2026 · Updated last week
- Using GPT to parse PDF ☆3,562 · Apr 17, 2025 · Updated 10 months ago
- Company Researcher tool helps you instantly understand any company inside out. ☆1,412 · Feb 11, 2026 · Updated 3 weeks ago
- MemFree - Hybrid AI Search Engine & AI Page Generator ☆1,490 · Aug 8, 2025 · Updated 6 months ago
- OCR & Document Extraction using vision models ☆12,155 · May 20, 2025 · Updated 9 months ago
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ☆7,342 · Feb 21, 2025 · Updated last year
- Convert PDF to markdown + JSON quickly with high accuracy ☆32,069 · Updated this week
- Turn local files into a prompt for an LLM ☆177 · Jan 19, 2025 · Updated last year
- Python tool for converting files and office documents to Markdown. ☆88,637 · Feb 20, 2026 · Updated 2 weeks ago
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆19,392 · Updated this week
- Task-Aware Agent-driven Prompt Optimization Framework ☆3,805 · Oct 13, 2025 · Updated 4 months ago
- A Comprehensive Toolkit for High-Quality PDF Content Extraction ☆9,433 · Jan 3, 2025 · Updated last year
- Open-source, production-ready customer service with built-in evals and monitoring ☆1,437 · Jan 12, 2026 · Updated last month
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ ☆10,096 · May 8, 2025 · Updated 9 months ago
- RAG Web UI is an intelligent dialogue system based on RAG (Retrieval-Augmented Generation) technology. ☆2,796 · Dec 8, 2025 · Updated 2 months ago
- ☆2,137 · Mar 17, 2025 · Updated 11 months ago
- Toolkit for linearizing PDFs for LLM datasets/training ☆16,947 · Feb 19, 2026 · Updated 2 weeks ago
- 🔥 Open Source Browser API for AI Agents & Apps. Steel Browser is a batteries-included browser sandbox that lets you automate the web wit… ☆6,494 · Feb 28, 2026 · Updated last week
- An open-source RAG-based tool for chatting with your documents. ☆25,168 · Updated this week
- 📃 A better UX for chat, writing content, and coding with LLMs. ☆5,389 · Feb 25, 2026 · Updated last week
- Detect and extract tables to markdown and csv ☆754 · Jan 24, 2025 · Updated last year
- Vision infrastructure to turn complex documents into RAG/LLM-ready data ☆2,940 · Sep 24, 2025 · Updated 5 months ago
- WhyHow Knowledge Graph Studio ☆898 · Dec 25, 2024 · Updated last year
- An AI personal tutor built with Llama 3.1 ☆1,991 · Feb 27, 2026 · Updated last week
- This React component is used to render Markdown into a beautiful poster image, with support for copying as an image. Md to Poster/Image/Q… ☆1,856 · Mar 5, 2025 · Updated last year
- 🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT) ☆6,787 · Jul 4, 2025 · Updated 8 months ago
- [EMNLP 2025] OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking ☆489 · Aug 23, 2025 · Updated 6 months ago
- Fetch an entire site and save it as a text file (to be used with AI models). ☆1,643 · Jan 18, 2025 · Updated last year
- Make any LLM think like OpenAI o1 and DeepSeek R1 ☆489 · Feb 6, 2025 · Updated last year
- RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to creat… ☆73,900 · Feb 28, 2026 · Updated last week
- ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows. ☆1,485 · Aug 27, 2025 · Updated 6 months ago
- 🔥 MaxKB is an open-source platform for building enterprise-grade agents. A powerful, easy-to-use, open-source enterprise-grade agent platform. ☆20,227 · Updated this week
- Get your documents ready for gen AI ☆54,754 · Updated this week
- ContextGem: Effortless LLM extraction from documents ☆1,805 · Feb 22, 2026 · Updated last week